DAOS Chaos - Breaking the 50 Percent Barrier
Category Administration IBM/Lotus Domino
DAOS is amazing, and maybe revolutionary. The more I work with it, the more potential I see for its implementation. We've begun enabling DAOS on our Win2008 archive servers, and will be turning next to the AIX mail servers.
I had expected a 30% gain. I had run the DAOS estimator on a 2.3 TB archive server, and I've been using DAOS on some ND8.5 application servers for over four months. I felt comfortable with DAOS, and thirty percent seemed a safe estimate. Before DAOS, I had analyzed a large sampling of our mail files and found that attachments typically used 70% of the mail file. The first picture is a screenshot of a report that I use to examine mail usage with attachments. The slide comes from my Lotusphere 2009 presentation on upgrades.
What I couldn't figure out from my own analytic tools was how much of the 70% was being used for duplicate attachments. Because my company encourages the auto-saving of all sent email, I was expecting a significant number of duplicates between the mail files. What I didn't expect was a 60% reduction in storage use. When I saw the final numbers, I was stunned. I told someone that the results were in the range of "science fiction."
DAOS recovered nearly two-thirds of the space of our SAN.
Midway through the conversion process I took a snapshot with the Notes Administrator Client to chart the difference in storage usage between the two clustered servers. Both are Domino 8.5, Win2008 servers connected to our NetApp SAN. The graphs of historical statistics are positioned strangely, but they're readable with a few markers. The timeline runs from right to left, with the most recent time against the left margin. The Y axis measures the amount of free space. When the lines slope downward, close to the bottom, it's time to ask the Unix admins to enlarge the partition.
I enabled DAOS on a few directories (that's the first upward bump), waited a day, and then began the conversion for all the mail files. Each SAN partition for the clustered pair of Domino servers is 2.3 TB, with 2.043 TB in use for the NSFs. At the completion of the conversion, the DAOS-enabled server went from 2.043 TB to 0.829 TB. That's a reduction of about 60%.
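As a quick check of that arithmetic, here is a minimal Python sketch. The function name `percent_reduction` is just something made up for illustration; the two figures are the ones quoted above.

```python
def percent_reduction(before_tb, after_tb):
    """Return the storage reduction as a percentage of the original size."""
    return (before_tb - after_tb) / before_tb * 100


before_tb = 2.043  # NSF storage in use before DAOS (figure from the post)
after_tb = 0.829   # storage in use after the conversion (figure from the post)

reduction = percent_reduction(before_tb, after_tb)
print(f"Recovered {before_tb - after_tb:.3f} TB, a reduction of {reduction:.1f}%")
```

The exact figure comes out a little under 60%, which is where the rounded number comes from.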
Pulling the attachments out of the mail files and eliminating duplicate copies improves the entire architecture and brings additional capabilities:
- Differential backups are much faster. The attachments are placed in file folders (encrypted), each folder holding 40k files. Differential backups of databases are not efficient, because altering a single document in a 2 GB NSF file triggers a complete backup of the entire 2 GB. With DAOS, however, (1) the mail file has been reduced in size and (2) the attachments don't change frequently. I'd expect differential backups to be at least twice as fast as before DAOS.
- NSF compacting is faster. Prior to DAOS, attachments were cut up into 64k chunks and serialized into multiple fields. With DAOS, the attachments are no longer in the database, so compacting can zip along, unhindered by managing the extraneous data.
- Extracting an attachment from a document is going to be faster. Large attachments make for lots of chopping apart to squeeze them into the database, and lots of stitching back together when they are retrieved. With DAOS, the file attachment is stored just like an HTML file, and read much faster than if it were being recomposed from its many 64k parts.
- File maintenance will be faster (e.g., consistency checks).
- Disk I/O demands are reduced.
- Because disk I/O demands are lower, indexing is faster.
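To make the duplicate-attachment savings concrete, here is a toy Python sketch of single-instance storage. It is not how DAOS is implemented: the `SingleInstanceStore` class, the use of SHA-256 as the key, and the 1 MB attachment are all assumptions chosen for illustration. The idea is only that identical attachment bytes are stored once, while each mail file keeps a small reference.

```python
import hashlib


class SingleInstanceStore:
    """Toy content-addressed store: identical attachments are kept once.

    Hypothetical illustration only, not DAOS's actual on-disk format.
    """

    def __init__(self):
        self._blobs = {}  # content hash -> attachment bytes
        self._refs = {}   # content hash -> number of documents pointing at it

    def put(self, data):
        """Store an attachment; return the reference a mail file would keep."""
        key = hashlib.sha256(data).hexdigest()
        if key not in self._blobs:
            self._blobs[key] = data  # only the first copy consumes space
        self._refs[key] = self._refs.get(key, 0) + 1
        return key

    def get(self, key):
        """Return the attachment bytes for a stored reference."""
        return self._blobs[key]

    def stored_bytes(self):
        """Total bytes actually held by the store."""
        return sum(len(blob) for blob in self._blobs.values())


store = SingleInstanceStore()
report = b"x" * 1_000_000  # a 1 MB attachment (made-up data)

# The sender's auto-saved copy plus two recipients: three mail files, one attachment.
tickets = [store.put(report) for _ in range(3)]

logical_bytes = 3 * len(report)
print(f"logical: {logical_bytes} bytes, stored: {store.stored_bytes()} bytes")
```

With auto-saved sent mail, every message with an attachment exists in at least two mail files, which is why duplicates add up so quickly.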
Technorati Tags: Lotus Domino, DAOS


Comments
Nathan, your zipped files are still just files, and they will benefit hugely from DAOS. Unless, of course, you suspect that your mail users are careful guardians who ensure there is only a single instance of a file, in a single mail account, at a time.
I should have mentioned Paul Mooney's blog as a good source for DAOS { Link }. And be sure to check the DominoWiki for some more advanced configuration tips { Link }, like coordinating the minimum DAOS threshold with the storage block size.
Posted by Jack Dausman at 07:05:14 PM on 05/27/2009
Posted by Nathan Chandler at 12:02:53 PM on 05/27/2009
For sure the advantages of DAOS are fantastic. Now I think it will be hard for IBM/Lotus to find something that's even better than it.
Now that attachments are out of the NSF and NSFs are lighter, I'd like to see what's next.
I've been a huge fan of NSFDB2, but now it's going to RIP. I'd love to see DB2/PureXML as a native storage solution for Domino notes in DXL. Maybe it's science fiction... but you never know.
For sure DAOS is "revolutionary" and its implementation is a masterpiece.
Note: with DAOS enabled at the server level (on mail.box), mail routing performance also improves a lot.
Posted by Daniele Vistalli at 10:11:30 PM on 05/26/2009