1dent1ty cHa0s: April 2007

Saturday, April 28, 2007

Quest for Speed, part 1

In a new series on our never-ending quest for increased metaverse performance, I've begun gathering additional data and metrics to help predict performance for the MIIS database. See the original post - "MIIS Database Sizing - Learning from Exchange" for information regarding the I/O calculations.

Fine Tuning the IOPS for Rotational Speed

So, in a previous posting we talked about some recommended constants regarding the number of I/O operations per second (IOPS) based on rotational speed. We had the following table:

Rotational Speed	IOPS
7200 RPM	72
10,000 RPM	100
15,000 RPM	150

As it turns out, these are merely guidelines - if you want to fine tune these based on the actual specifications of your drive then you can use the following calculation I found here:

iops = 1000 (ms/s) / (average read seek time (ms) + (maximum rotational latency (ms) / 2))

Use the following table to look up the maximum rotational latency by rotational speed - we'll need this to fine tune our expected IOPS:

Rotational Speed	Max Rotational Latency
7200 RPM	8.3 ms
10,000 RPM	6 ms
15,000 RPM	4 ms

Now that we have this information we just need to find the Average Seek Time published for the drive type we're going to fill our array with. Referenced data is from HP's website:

	10k 3.5 U320 SCSI	15k 3.5 U320 SCSI	15k 3.5 SAS	15k 2.5 SAS
Avg Seek Time (across capacities)	5.4 ms	3.8 ms	3.5 ms	3.0 ms
Expected IOPS per spindle	119	172	182	200

So, as you can see, while the faster rotational speed does make a huge difference in performance, the move to Small Form Factor (SFF) drives also speeds up the average seek times. To get an exact measurement you will want to find the exact specifications for your drive and use them in your calculations.
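As a rough sketch, the calculation above can be scripted so you can plug in the numbers from your own drive's spec sheet. The function names here are my own, and the maximum rotational latency is assumed to be the time for one full revolution (60,000 ms divided by the RPM), which agrees with the latency table above:

```python
def max_rotational_latency_ms(rpm: float) -> float:
    """Time for one full platter revolution, in milliseconds (assumed definition)."""
    return 60_000.0 / rpm


def expected_iops(avg_seek_ms: float, rpm: float) -> float:
    """iops = 1000 / (average seek time + half of the maximum rotational latency)."""
    return 1000.0 / (avg_seek_ms + max_rotational_latency_ms(rpm) / 2.0)


# Average seek times (ms) and rotational speeds from the table above.
drives = {
    "10k 3.5 U320 SCSI": (5.4, 10_000),
    "15k 3.5 U320 SCSI": (3.8, 15_000),
    "15k 3.5 SAS": (3.5, 15_000),
    "15k 2.5 SAS": (3.0, 15_000),
}

for name, (seek_ms, rpm) in drives.items():
    print(f"{name}: {expected_iops(seek_ms, rpm):.0f} IOPS per spindle")
```

Run as-is, this should reproduce the 119, 172, 182 and 200 figures in the table; swap in your own seek time and RPM to get an estimate for a different drive.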

Tuesday, April 24, 2007

Complex Demonstrations or Why Presenters Insist on Suffering

So, this year marked my second year presenting at the Directory Experts Conference and both years have been very rewarding; however, I took quite a different approach to preparing this year than I did the previous one.

Last year I gave two presentations with very complex demonstrations, with multiple virtual servers simulating an Active Directory environment and using MIIS to do things like migrate users and provision entitlements. The demonstrations involved lots of syncs, state checks, previews and debugging. I guess I was really looking for two things - the things that all technical presenters are looking to prove or demonstrate:

There are no tricks up my sleeve - what you are seeing is real and you should therefore be impressed, and
we're looking for the "aha" or "oo-ah" moment.

The Stark Reality

What I personally discovered was that as advanced and impressive as I thought the demonstrations were, ultimately the process of demonstrating them was not. Frankly, the type of attendees we have at DEC have already been there and done that. They've seen syncs, schema extensions, DC promotions and the like, so you don't have to prove that these things work - you just have to demonstrate why your process or approach has value and deserves their attention.

For this year's presentation I felt that I really skimped on the demo - I just poked around in Visual Studio and ran very few actual processes. I had taken screen shots of all of the before and after stuff and placed them inline within the presentation. I did this based on my poor experiences from the previous year and on listening to other presenters comment on precisely the same issues. However, I was still concerned that maybe I had somehow cheated the attendees by not providing the complex and therefore impressive demonstration.

Shared Pain

Early on Monday (April 23rd) I had the good fortune to speak with Joe Kaplan, a fellow MVP in the Directory Services Programming space, prior to his presentation on Programming the Directory in .NET for Longhorn and the Future. Joe was expressing some of the same concerns regarding his demos and the fact that, due to time constraints, he had to severely limit their complexity. His solution was actually quite simple:

Demonstrate the code fragments and walk the audience through the pertinent bits
Prepare the audience for the types of things that could happen when the code is run and how the code mitigates those concerns
Display a snapshot (in this case a text file) with the captured output when it was run against a complex/robust environment

I think the approach was not only effective but also managed to avoid the whole process of invoking calls to the demo gods and waiting patiently while things cycle, hoping that you don't crash the fragile yet complex environment running on a mere laptop. It accomplished the most important goal, which was to hold the attention of the attendees and move the discussion to the appropriate points.

Epiphany

After watching Joe's presentation and a few others throughout the day, an idea began to form which resulted in a rather interesting epiphany (or at least interesting to me) - not only are complex demonstrations really only impressive to ourselves, but this sort of problem is regularly faced and solved on the plethora of daytime cooking shows. Take your pick, but every cooking show you've ever seen has the same basic formula:

Introduce a new dish (concept)
Talk about the ingredients (components and interdependencies)
Demonstrate assembling of the ingredients (walkthrough of the pertinent bits)
Reveal a previously prepared dish of the same design (display the captured output)

Wow, so why do we continue to torture ourselves as presenters when complex demos rarely go well, and the standard practice for doing this sort of thing is already widely accepted and trusted?

Take a lesson from your favorite cooking show - don't try to cook the pie while your audience is waiting, show them one you already baked!

NOTE: If you're interested in .NET Directory programming, check out Joe Kaplan and Ryan Dunn's book - The .NET Developer's Guide to Directory Services Programming.

Wednesday, April 18, 2007

Interoperability Matrix Updated for ILM 2007

In a previous posting I outlined the dependencies for MIIS, SQL, SQL Reporting Services, and Visual Studio. I've since updated it to now include the .NET Framework and the changes to ILM 2007. Since the table is now too large to post here, I'm embedding an image:

Caveats include:

* = Pre-SP1: Only as a connected data source, not as the metaverse; Post-SP2: as both
** = Only if the miiserver.exe config is set to force 1.1 CLR
*** = Can be used to report against, but must be hosted in SQL 2005

Notice that SQL Reporting Services forms dependencies of its own with Visual Studio and of course, Visual Studio carries with it dependencies on the .NET Framework. Hey, it's all interconnected!

Monday, April 16, 2007

MIIS Reporting Pack Announced for DEC 2007 Attendees

Those of you attending DEC 2007 this year will be able to download a free SQL 2005 Reporting Services report pack for use with MIIS. The Visual Studio 2005 source files will be released and you will be free to customize them for use in your own projects. Those attending my "Using SQL Reporting Services for MIIS Reporting" presentation will see firsthand how the reports are crafted and learn the basic query patterns that are the basis for so many reports.

Those of you not lucky enough to attend the convention this year will be able to download it the following week. Here is a sneak peek at one of the reports included:

Connectors Query Pattern

Connectors embody an implicit relationship between a record in the mms_connectorspace table and a record in the mms_metaverse table. The relationship is created by linking the object_id values from each table in another table called mms_csmv_link. The entry in this table is the connector.

Trying to link the two tables without the mms_csmv_link table does not result in a valid join, even though SQL Query Analyzer wants to link them because the field names are the same.

Only adding the mms_csmv_link table into your query will result in a valid INNER JOIN.
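To illustrate the point, here is a small sketch using Python's built-in sqlite3 module against a toy, heavily simplified version of the three tables. The column lists and sample rows are made up for the illustration and are not the real MIIS schema; only the table names and the object_id, cs_object_id and mv_object_id columns come from the description above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Toy stand-ins for the MIIS tables; real tables have many more columns.
    CREATE TABLE mms_connectorspace (object_id TEXT PRIMARY KEY, rdn TEXT);
    CREATE TABLE mms_metaverse (object_id TEXT PRIMARY KEY, displayName TEXT);
    CREATE TABLE mms_csmv_link (cs_object_id TEXT, mv_object_id TEXT);

    INSERT INTO mms_connectorspace VALUES ('cs-1', 'CN=Jane Doe'), ('cs-2', 'CN=John Roe');
    INSERT INTO mms_metaverse VALUES ('mv-1', 'Jane Doe');
    INSERT INTO mms_csmv_link VALUES ('cs-1', 'mv-1');
""")

# Joining directly on the same-named object_id columns matches nothing here,
# because each table issues its own identifiers.
direct = conn.execute("""
    SELECT cs.rdn, mv.displayName
    FROM mms_connectorspace AS cs
    INNER JOIN mms_metaverse AS mv ON cs.object_id = mv.object_id
""").fetchall()

# Going through mms_csmv_link returns the one connector in the toy data.
linked = conn.execute("""
    SELECT cs.rdn, mv.displayName
    FROM mms_connectorspace AS cs
    INNER JOIN mms_csmv_link AS l ON cs.object_id = l.cs_object_id
    INNER JOIN mms_metaverse AS mv ON l.mv_object_id = mv.object_id
""").fetchall()

print(direct)  # []
print(linked)  # [('CN=Jane Doe', 'Jane Doe')]
```

Note that 'cs-2' never shows up in the linked result: it has no row in mms_csmv_link, so in this toy data it plays the part of a disconnector.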

The following query can be used to query for connectors and is the basis for virtually every other query:

SELECT mms_management_agent.ma_name,
       mms_metaverse.employeeID,
       mms_metaverse.uid,
       mms_metaverse.employeeStatus,
       mms_metaverse.displayName,
       mms_metaverse.employeeType,
       mms_metaverse.physicalDeliveryOfficeName AS location
FROM mms_management_agent WITH (nolock)
INNER JOIN mms_connectorspace WITH (nolock)
    ON mms_management_agent.ma_id = mms_connectorspace.ma_id
INNER JOIN mms_csmv_link WITH (nolock)
    ON mms_connectorspace.object_id = mms_csmv_link.cs_object_id
INNER JOIN mms_metaverse WITH (nolock)
    ON mms_csmv_link.mv_object_id = mms_metaverse.object_id
WHERE (mms_metaverse.object_type = N'person')
  AND (mms_management_agent.ma_name = @MAName)
ORDER BY mms_metaverse.uid

The finished rendered report looks something like so:

MIIS Database Sizing - Learning from Exchange

I asked one of our Exchange Architects, Jake Ballecer, to articulate what the Exchange community takes for granted - sizing Exchange data stores. No one thinks twice about throwing a hundred or so spindles at an Exchange implementation and they have the methodology to justify that spindle count - so why is it so hard for us when attempting to size the disk subsystem for an MIIS implementation? Let's see what Jake has to say on the subject:

Simplified Spindle Calculation for Exchange Storage Allocation

Since the disk subsystem is the most common performance bottleneck for Exchange 2003, Exchange engineers have learned to include adequate spindle counts in their storage designs. The fundamental design protocol is deceptively simple: (1) determine your Exchange IOPS (I/O per second) requirement and (2) make sure that you provide enough dedicated physical disks to match the IOPS value. It’s the classic supply and demand scenario.

An Exchange IO can be loosely defined as a read or write operation on a single database page. The IOPS demand value can either be measured or estimated. Measuring this value in an existing Exchange environment gives a great view of actual real-world numbers. Use perfmon to log disk reads/sec, disk writes/sec, and disk transfers/sec on the Exchange database disks during periods of heaviest usage. This method will provide you with the actual IOPS value (transfers/sec) and the read:write ratio that we will need later on. Use peak or near-peak transfers/sec values to determine your IOPS demand number. Divide this IOPS demand number by the number of mailbox users that were actively accessing the database at the time of the transfers/sec reading. This is your IOPS/mailbox value for Exchange 2003 calculations. Use averages for read:write ratio.

To summarize measured values:

IOPS/mailbox = (peak or near-peak transfers/sec)/(active mailbox users at the time of peak reading)

Read_ratio = (average reads/sec)/((average reads/sec)+(average writes/sec))

Write_ratio = (average writes/sec)/((average reads/sec)+(average writes/sec))
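A minimal sketch of the three measured values, for anyone who would rather script this than work it out by hand. The function names and the sample perfmon readings are hypothetical, picked only to give round numbers:

```python
def iops_per_mailbox(peak_transfers_per_sec: float, active_mailbox_users: int) -> float:
    """Peak (or near-peak) disk transfers/sec divided by the users active at that time."""
    return peak_transfers_per_sec / active_mailbox_users


def read_write_ratios(avg_reads_per_sec: float, avg_writes_per_sec: float) -> tuple[float, float]:
    """Return (Read_ratio, Write_ratio) from average reads/sec and writes/sec."""
    total = avg_reads_per_sec + avg_writes_per_sec
    return avg_reads_per_sec / total, avg_writes_per_sec / total


# Hypothetical readings: 600 transfers/sec at peak with 1,200 active users,
# averaging 300 reads/sec and 100 writes/sec.
per_mailbox = iops_per_mailbox(600.0, 1200)
read_ratio, write_ratio = read_write_ratios(300.0, 100.0)

print(per_mailbox)              # 0.5
print(read_ratio, write_ratio)  # 0.75 0.25
```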

If you need to estimate your IOPS demand value, you will need to determine if the mailbox users can be profiled as light, average, heavy, or hard core Exchange users. We’ve learned to use more aggressive estimates in recent years due to the evolution of corporate email usage. The most common light Exchange users are workers that only casually access email and have very small mailboxes. Use an IOPS/mailbox value between .2 and .4 for this category. Average Exchange users are workers who typically receive around 40-100 emails and send less than 25 per day. Use an IOPS/mailbox value of .5 to .8 for this category. Heavy Exchange users are workers that have Outlook open pretty much their entire work day and constantly access their mailbox and calendar contents. Use an IOPS/mailbox value between 1 and 1.5 for this category. Hard core Exchange users are heavy users that regularly send and receive messages larger than 512MB. Use an IOPS/mailbox of 2-2.5 for this category. For read:write ratios, use .75:.25 for typical environments or .67:.33 if your messaging environment is skewed heavier towards writing.

The last piece of the demand side of the equation is the RAID penalty on writes. Different RAID levels will write an Exchange page differently. RAID 1, 10, or 0+1 configurations will write a page twice due to mirroring. Due to this behavior, these RAID levels are assigned a write penalty value of 2. In the case of RAID 5, the write operation follows this sequence: the new Exchange page is written, the rest of the contents of a stripe is read, the parity is calculated, then the parity for the stripe is written. Due to this behavior, RAID 5 configurations are assigned a RAID penalty of 4.

With the values compiled above, we can calculate the IOPS demand for a given set of mailboxes (#mailboxes):

IOPS_demand = (IOPS/mailbox * #mailboxes * Read_ratio)+(IOPS/mailbox * #mailboxes * Write_ratio * RAID_Penalty)

The second component in the calculation is the supply part. We need to know how many spindles it will take to sufficiently cover the IOPS demand. The rule of thumb is that a single physical disk is able to provide 100 IOPS for every 10,000rpm. So that’s 72 IOPS for a 7,200 rpm drive, 100 IOPS for a 10,000rpm drive, or 150 IOPS for a 15,000rpm drive.

#Spindles = IOPS_demand / IOPS_per_Spindle

Note though that this number of spindles will cover the demand but will be at 100% performance capacity. Allow for growth.
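Putting the demand and supply sides together, a sketch along these lines works through the two formulas above. The function names, the sample mailbox count, and the optional headroom factor are my own assumptions; the headroom is just one way of expressing "allow for growth" and is not part of Jake's formula:

```python
import math

# Write penalties as described above.
RAID_WRITE_PENALTY = {"RAID 1": 2, "RAID 10": 2, "RAID 0+1": 2, "RAID 5": 4}


def iops_demand(iops_per_mailbox, mailboxes, read_ratio, write_ratio, raid_penalty):
    """Reads count once; writes are multiplied by the RAID write penalty."""
    base = iops_per_mailbox * mailboxes
    return base * read_ratio + base * write_ratio * raid_penalty


def spindles_required(demand, iops_per_spindle, headroom=0.0):
    """Round up to whole disks; headroom=0.25 adds 25% for growth (an assumption)."""
    return math.ceil(demand * (1.0 + headroom) / iops_per_spindle)


# Hypothetical example: 1,000 average users at 0.5 IOPS/mailbox, 0.75:0.25 read:write,
# on 15,000 rpm drives (150 IOPS per spindle by the rule of thumb).
for level in ("RAID 10", "RAID 5"):
    demand = iops_demand(0.5, 1000, 0.75, 0.25, RAID_WRITE_PENALTY[level])
    print(level, demand, spindles_required(demand, 150), spindles_required(demand, 150, headroom=0.25))
```

With these sample numbers RAID 10 comes out at 625 IOPS (5 spindles, or 6 with 25% headroom) and RAID 5 at 875 IOPS (6 spindles, or 8 with headroom), which shows how quickly the write penalty adds up.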

Jake Ballecer

Technical Architect

Ensynch, Inc

So sure, Exchange doesn't use SQL and MIIS does, so how does this apply? Well, the methodology behind using disk IOPS to drive the sizing of the array isn't limited to just Exchange, or SQL, but any I/O intensive application.

Transactions vs Mailboxes

When MIIS begins to process a synchronization for a connectorspace object it begins a SQL transaction which includes all operations from the start of the sync through inbound attribute flow, provisioning, and export attribute flow for all other connectors associated with that metaverse object. The transaction commits at the conclusion of the sync as long as no errors occurred. For any given MIIS implementation, one SQL transaction may contain as few as one record up to the total number of connectors connected to that metaverse object plus one for the metaverse object itself. The number of objects processed for each sync might be expressed as such:

total_number_of_connectors + 1

So, if I have a metaverse object with 5 connectors then I have 6 objects to sync for every transaction, although, depending on your rules, not every metaverse object will have the same number of connectors. Whereas in the Exchange calculations we use the number of mailboxes, for MIIS we would use the number of transactions executed at peak to calculate demand.

Calculations

At the very least I think the following information is crucial to obtain during development:

IOPS/Transaction = (Disk Transfers/sec [Avg]) / (SQL Transactions/sec [Peak])

This data would need to be gathered on the following items:

MIIS Database logical disk
MIIS Log logical disk
tempDB Database logical disk
tempDB Log logical disk

Furthermore, the read and write ratios would need to be calculated for the same instances:

Read_ratio = (Disk Reads/sec [Avg])/((Disk Reads/sec [Avg]) + (Disk Writes/sec [Avg]))
Write_ratio = (Disk Writes/sec [Avg])/((Disk Reads/sec [Avg]) +(Disk Writes/sec [Avg]))

Last but not least, the IOPS demand would be calculated using the following equation for each given instance:

IOPS_demand = (IOPS/Transaction * Transactions/sec [Peak] * Read_ratio)+(IOPS/Transaction * Transactions/sec [Peak] * Write_ratio * RAID_Penalty [2 for RAID 1/1+0, 4 for RAID 5])

The following table illustrates the performance counters required:

Object	Counter	Instance
Logical Disk	Disk Reads/sec	MIIS DB & Logs, tempdb DB & Logs
Logical Disk	Disk Transfers/sec	MIIS DB & Logs, tempdb DB & Logs
Logical Disk	Disk Writes/sec	MIIS DB & Logs, tempdb DB & Logs
SQLServer:Databases	Transactions/sec	MIIS DB
SQLServer:Databases	Transactions/sec	tempdb
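Here is a hedged sketch of how the MIIS calculations above might be strung together for one logical disk. All function names and counter values are hypothetical. I'm also assuming that IOPS/Transaction is measured once (say, in development) and then multiplied by the peak transaction rate you expect in production, since using the same peak rate for both steps would simply hand you back the measured transfers/sec:

```python
import math


def iops_per_transaction(avg_disk_transfers_per_sec, peak_sql_transactions_per_sec):
    """IOPS/Transaction = Disk Transfers/sec [Avg] / SQL Transactions/sec [Peak]."""
    return avg_disk_transfers_per_sec / peak_sql_transactions_per_sec


def read_write_ratios(avg_reads_per_sec, avg_writes_per_sec):
    """Return (Read_ratio, Write_ratio) from the Disk Reads/sec and Disk Writes/sec averages."""
    total = avg_reads_per_sec + avg_writes_per_sec
    return avg_reads_per_sec / total, avg_writes_per_sec / total


def miis_iops_demand(iops_per_txn, peak_txn_per_sec, read_ratio, write_ratio, raid_penalty):
    """RAID penalty: 2 for RAID 1/1+0, 4 for RAID 5."""
    base = iops_per_txn * peak_txn_per_sec
    return base * read_ratio + base * write_ratio * raid_penalty


# Hypothetical counters for the MIIS Database logical disk, measured in development.
per_txn = iops_per_transaction(400.0, 50.0)      # 8.0 I/Os per transaction
reads, writes = read_write_ratios(300.0, 100.0)  # 0.75, 0.25

# Projected production peak of 100 transactions/sec on a RAID 1+0 array (an assumption).
demand = miis_iops_demand(per_txn, 100.0, reads, writes, raid_penalty=2)
print(demand, math.ceil(demand / 150))  # 1000.0 IOPS, 7 spindles at 150 IOPS each
```

The same steps would be repeated for the MIIS Log, tempDB Database and tempDB Log disks, each with its own counters and RAID level.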

So far my initial calculations for existing implementations have produced realistic spindle counts, but only time and much more evaluation will decide whether or not this approach is ultimately valid for MIIS implementations. I have started a conversation concerning general disk performance tuning in the MIIS TechNet forum; please post your own performance data in that thread if you've applied the above calculations.

Friday, April 06, 2007

Richard Wakeman : Foreign-Domain Group Management with MIIS

Richard's latest post on IDMCenter.com is nothing short of brilliant - he has extended the group management capabilities of MIIS/ILM and provided a method to manage members from trusted domains.

Foreign Security Principals

When working with Domain Local or Universal Security groups in Active Directory you have the ability to add a user or group from a trusted domain as a member. This allows a user in another domain to access your resource without the need to create and manage a separate account for them and is one of the cornerstones of Active Directory. As it happens, there are consequences based on the type of trust that is in place. When domains are created within an AD forest, they build transitive trust relationships with each other along a parent/child paradigm. If you want to add a trust to a domain in an external forest (where there is no cross-forest trust) or perhaps to an NT4 domain then you're building an external trust. It's when we attempt to integrate members from across external trusts that we end up dealing with foreign security principals.

Whenever you add a member from a trusted external domain, AD adds a new object of type foreign security principal to the local domain. It's this object that serves as the local placeholder for the object in the trusted domain. The reason why this has been a challenge for group management is that these are not objects of the user or group class and are not handled by default - Richard's solution addresses how to extend functionality to allow the inclusion of these objects!

Link to Richard Wakeman : Foreign-Domain Group Management with MIIS

Wednesday, April 04, 2007

MIIS SP2 Now Available

The much anticipated release of MIIS SP2 is now available for download. SP2 marks several milestones in the lifecycle of Identity Integration Server:

Full Support for SQL Server 2005 as both a connected data source and as the metaverse
Full Support for Visual Studio 2005 means the rules extension templates are now provided in VS2005 format requiring all new extension development be done in this version. Visual Studio .NET 2003 is no longer supported for new extension development or debugging of existing extensions once SP2 is applied. However, your existing compiled extensions will continue to function but you will have to convert them to 2005 if you wish to do any further development.
.NET Framework 2.0 is now required for both rules extension development and for operation of the MIIS server. You can, however, still run your existing 1.1 compiled extensions until you wish to make any changes. (Visual Studio 2005 and .NET Framework 2.0 form a dependency here)

If you're currently running the 1046 patch then there is not much additional in the way of hotfixes that you'll get by applying SP2 - the real improvements have been made partially due to the move to .NET Framework 2.0:

Reference attribute processing is 40% faster now
The ability to commit the changes executed in a Preview as opposed to rolling back the transaction

Missing in action are:

64-bit server support (SQL 200x 64-bit is supported, but only on a separate server)
Updated Password portal

Recommendations
Unless you specifically need one of the aforementioned additions, I would recommend that you include the SP2 update as part of a larger effort to migrate your MIIS deployment to SQL 2005 and Visual Studio 2005. You will gain much more in overall capability by uplifting to the 2005 platform than by applying SP2 alone. Here is my prioritized list of updates:

Apply .NET Framework 3.0 - 2.0 is required for the updated CLR components; however 3.0 contains the 2.0 CLR components and the bits for Workflow Foundation and Communication Foundation. You can apply this update prior to applying SP2 as long as you apply the miiserver.exe.config file update described here.
Convert to Visual Studio 2005 - Recompiling your existing extensions in VS 2005/.NET 2.0 CLR hasn't required any changes to my extensions to date, but there are some capabilities in the 2.0 CLR that you might find compelling.
Migrate the MIIS DB to SQL Server 2005 - Changing db platforms will require a bit more thought for large deployments which is why I listed it last. If you have a local SQL deployment and MIIS is really the only thing present then you have much less in the way of dependencies to deal with. I think most people will choose SQL Server 2005's Copy Database Wizard to perform the migration.
Upgrade to ILM 2007 - In May ILM 2007 will be available and your existing licenses will convert to ILM licenses if you have Software Assurance. When the bits become available you should update to the new build so you can apply future hotfixes and updates to ILM, as there should be no further updates to MIIS 2003 beyond SP2.

The dependencies have changed so look for another dependency matrix in the next day or so!