Inside Microsoft CRM


Moving on

Well, sort of moving on. I'm moving my blog over to MSDN to be a bit closer to the rest of the development team. I'm not abandoning the Inside MS-CRM "column", though; I just thought it would be a little easier if I was closer to "home". So, join me over at MSDN and watch the RSS feed for more entries.

I'm moving

I've made the decision to move over to the official Microsoft world of bloggers. For now I'm going to assume that I can pretty much say whatever I want over there, but if things get weird I'm coming back.

Over time I'll try to figure out how to move my past posts over so they're not lost in the shuffle, but for now it'll look like I'm starting over. The added benefit is that I'll be closer to Jason and Aaron, two other members of the MS-CRM team.

I know I have a backlog of articles to write, but don't worry, I am writing them.


Where's Mj

I'm still around, but I've been kinda busy. It's funny, I set out to write this because I had a ton of stuff on the top of my head that I wanted to unload on the MS-CRM community. The problem is that there's just so much stuff and so little time that I've found it harder to prioritize the list. I'm getting closer though and will start writing more over the next few weeks while we're ramping to alpha for V2 (no, I can't give you a date, and no I won't give you a feature list, sorry). I've received a handful of great suggestions both publicly and privately, and I've been scouring the newsgroups for more of those nasty requests that keep popping up.

As I mentioned previously, I've finally got an instance of 1.2 installed and running, but I had to break the rules to get it to work. The biggest rule I broke, and I still haven't decided if I want to get around it or not, is that I installed to a named SQL instance. While the core product just doesn't care, the replication story for the Outlook client has all kinds of issues with named instances. But the good thing is that I've got an off-campus installation running, and I can start breaking things in ways that should help the community get on with configuring and supporting the 1.2 product while we busily work on the next release.
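For anyone who hasn't bumped into named instances yet: the instance name rides along with the server name anywhere a connection is made, which is exactly where tools that assume a default instance fall over. An illustrative connection string (the server, instance, and database names here are made up):

```
Data Source=CRMSQL\CRM12;Initial Catalog=Adventure_Works_Cycle_MSCRM;Integrated Security=SSPI
```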

So, hang in there, I'll be writing more soon. Keep the ideas coming in too because they really help me prioritize what information is most useful. There's a ton of stuff to write about and I'm looking to you folks to guide that work. Look for a more complete serialization example, a forms XML example that adds a custom tab, and a more detailed callout sample.


More fun with XSD

Torsten Schuster asked about the XML serializer, but unfortunately the question was pretty buried. I saw it, but I'm not sure anyone else did. It sounds to me like another case of old CRM XSD meeting the VS XSD tool, but I'm not really sure.

From what I gather, the platform and APIs are behaving correctly. For example, on a sales order retrieve, a column set along these lines can be passed in.

<columnset><column>customerid</column></columnset>

Which should (and in this case, does) result in the following XML.

<customerid name="STM Ltd." dsc="0" type="1">{3E8CDBD0-FD2E-409D-BB8D-39870AB689C1}</customerid>

The question is about the "customerid" element and why the serializer isn't pulling it into the hydrated object. I can only guess, but it sounds like the XSD doesn't have a correct definition for "salesorder", nor does it have a definition for "customerid".

Ideally, the sales order XSD should have a "customerid" element of type "tns:customerType", which references this definition:

<xsd:complexType name="customerType">
  <xsd:simpleContent>
    <xsd:extension base="xsd:string">
      <xsd:attribute name="name" type="xsd:string" />
      <xsd:attribute name="type" type="xsd:int" use="required" />
      <xsd:attribute name="dsc" type="xsd:int" />
    </xsd:extension>
  </xsd:simpleContent>
</xsd:complexType>
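The "salesorder" type would then reference that type with an element declaration along these lines (illustrative - the occurrence constraints depend on the rest of the schema):

```xml
<xsd:element name="customerid" type="tns:customerType" minOccurs="0" />
```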

I can't guarantee that the VS XSD tool will cope well with the XSD that I talked about earlier. Although making the XSD tool deal with it is fairly trivial, I still prefer the other code generator.
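For the curious, a class that the XmlSerializer can hydrate from that element would look roughly like this. This is a hand-written sketch, not actual tool output:

```csharp
using System.Xml.Serialization;

// Sketch of what a generated customerType class might look like.
public class CustomerType
{
    // The element body: {3E8CDBD0-FD2E-409D-BB8D-39870AB689C1}
    [XmlText]
    public string Value;

    [XmlAttribute("name")]
    public string Name;

    [XmlAttribute("type")]
    public int Type;

    [XmlAttribute("dsc")]
    public int Dsc;
}
```

The salesorder class then just needs a public member of this type mapped to "customerid" (via [XmlElement("customerid")]) for the serializer to pull it in.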

Like I said a while back, I can't support any of this stuff, but I can lend guidance on occasion. Hopefully this information is enough to get things moving again. If not, well, maybe someone else can chime in.


Entity size is always a problem

Running into the customization ceiling when adding attributes? I feel your pain. I really do. The team down the hall from me is working quite hard on making some of this pain go away, and they've done a bunch of work in the query processor layer in the platform. There's a reason the limitation exists in V1.x and there's a reason it wasn't "fixed" earlier.

The original COLA (contact, opportunity, lead, and account) definitions were quite small and left a ton of room for extensions. One of the things we looked at was allowing a style of customization where one could store everything one wanted in an XML document in the database. There were way too many problems with that approach (although there are some great upsides too). Simply put, search and display are going to be a problem with the property bag approach. There really aren't any great mechanisms for telling <fetch> about the semantics of an attribute. It knows all about the entities, attributes, and relationships, but that's where its knowledge stops. The application, and most other display patterns (except reporting), would work fairly well because it's all just XML and XSLT, and writing another XPATH expression to reach into the bag and pull out the rabbit is a well-understood problem.
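To make the property-bag idea concrete, imagine the extension data stored something like this (a hypothetical shape, not anything we shipped):

```xml
<extensions>
  <ext name="petname">Whiskers</ext>
  <ext name="species">cat</ext>
</extensions>
```

Display is easy - a stylesheet just evaluates an expression like extensions/ext[@name='petname'] - but <fetch> has no idea that 'species' exists, what type it is, or how to filter or sort on it.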

The second approach was to allow physical attributes to be added to entities in the form of database columns. There are some problems with this as well, particularly around name collisions and upgrade scenarios, but none of those couldn't be overcome with some decent engineering work.

A little history lesson may help. This is an excerpt from a whitepaper I wrote when we first started looking at how to create an extensible product. This is really ancient history at this point, so there's really no reason I can think of not to share it.

Several proposals are on the table to allow application developers to customize the storage characteristics (the tables and fields).

1) The approach taken for ClearLead 1.x (bCentral Customer Manager). Each interesting object (business, user, prospect, event) has a set of developer-defined named properties. This approach was an attempt at solving the problems inherent in approach 3. However, it quickly caused two severe problems. First, performance was horrible: each query required multiple outer joins to gather all the detail-level information. Second, the data stored rapidly exploded. Where it would have been possible to store a single inbound email event record in a single row using an ntext blob, the CL model took the approach that all large data be broken into 2000-character chunks and stored individually. This required that any time this information was read or written, the data had to be reconstructed.

2) Expose a single, opaque, application-specific blob field on every interesting object. This has some appeal since it leaves all the interpretation to the application and puts the burden on the developer to manage and render this information as necessary. The drawback here is that the blob isn't quickly searchable and can't be indexed (full-text indexing is an option, but isn't quite mature enough to be relied upon).

Another drawback with this format is that simple queries against the data are difficult to construct and very expensive to run. For example, how would a query be constructed which finds all contacts who brought a cat into a vet clinic in May and were serviced by Dr. Smothers? If this data is 'stuffed' into a single XML blob, the format isn't controllable by the platform, so a generic query like this wouldn't be possible to construct.
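To illustrate, here's roughly what that query looks like when the attributes are real columns versus stuffed into a blob. The schema is invented for the example:

```sql
-- With real columns the query processor can use indexes:
SELECT c.fullname
FROM contact c
JOIN visit v ON v.contactid = c.contactid
WHERE v.species = 'cat'
  AND v.visitdate >= '20040501' AND v.visitdate < '20040601'
  AND v.doctorname = 'Dr. Smothers'

-- Against an opaque blob the best you can do is string matching,
-- which scans every row and breaks the moment the format changes:
SELECT c.fullname
FROM contact c
WHERE c.extensiondata LIKE '%<species>cat</species>%'
```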

A secondary problem with this approach is the opaqueness of the data: neither the application nor the platform has any knowledge of the document structure. The platform would need to be written with the document structure in mind to make any reasonable use of the data, in which case the extensibility mechanism is defeated. The application, on the other hand, may have knowledge of the structure, but may not have any guarantee about it. That is, the structure may need to be interpreted differently for each individual object. [If the application were to force a fixed document structure on each class of objects, that would reduce some of the problems.]

3) Supply a fixed number of customizable fields per object - say 5 or 10 sql_variant fields. The problem with this approach is that it breaks the zero-one-infinity rule. As soon as we present n fields to the developer, they'd ask for n + 1. If we told them they had 255 UNICODE characters per field, they'd ask for 256. We can get around the second part of this problem by implementing the extra fields as sql_variant; however, this limits the fields' usefulness by changing their meaning in large searches.

4) Use a metadata-driven model and "hide" the physical model from the application and platform developers. The appeal here is that each developer can actually think about the problem at hand and customize the object definition to meet their needs.

For the longest time we (hell, I'll take the blame on this, this was my idea) were under the impression that spreading things across tables would be a way around the problem. That's one of the reasons that addresses are bound into the entities (although I really dislike that design, I have to say it does make sense at times). The issue is that SQL needs to create temp tables to hold the 'inserted' table data for the update triggers. While most people would never write all of the data to a record at once, it is possible, and in those situations things will just break.

The way we get around this is to remove the updatable views and triggers altogether and use the metadata to construct the cross-table queries. Until that happens there's really no way around the 8k limitation (at least not in the supported world).
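In other words, instead of an updatable view plus triggers, the cross-table query gets generated from metadata at runtime. Something like this hypothetical shape (the table and column names are made up):

```sql
-- Generated from metadata: base attributes in one table,
-- customer-added attributes in another, joined on the primary key.
SELECT b.accountid, b.name, x.new_petname
FROM AccountBase b
JOIN AccountExtensionBase x ON x.accountid = b.accountid
```

Each physical table stays under the 8k row limit on its own; stitching the pieces back together becomes the query layer's problem instead of the trigger's.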

Looking back on this now, I think I'd take either option 1 or 2 above. If I were to take the XML blob approach, I'd likely work on an extension pattern that forced the extension author to describe the extension in terms of metadata (and in terms of an XSD) so the tools which manipulate metadata for presentation, query, and update operations would "know" how to interpret the data. It still doesn't solve the reporting problems, and it likely won't until a reporting rendering engine can be built that knows about XML as source data and uses XPATH for the layout. There would still be problems with query, particularly around aggregate functions and ordering. What if someone wants to group on an element in the extension and order by another? They're not columns to SQL, so you'd need to lift that functionality out of the database (where it belongs) and into an independent query processing layer...


Third time's a charm

Well, I'm back for a while. I just finished up my M2 features for our next release. Time to hand it off to the Test team for a while and let them beat on it. But, more on the V2 stuff later, after alpha, or beta, or something where we've made the feature set public. I don't want to say anything for fear of getting everyone excited about something and then having it cut at the last minute.

But, anyway, what's the title got to do with that? I've been trying to install the 1.2 MSDN release. Normally that wouldn't be much of a problem; I've probably installed it at least a few dozen times (and now that I'm focused on dev work for V2 I'm installing V2 sometimes three times a day - and yes, the new setup is incredibly cool and easier to use). This time was different. I was trying to install on a VPC running Windows 2003 Server. Nothing particularly special about that VPC: it's a domain member, and the database server will be running somewhere outside of a VPC. What was different was that I had installed the Whidbey community preview, which dropped the 2.0 .NET Framework, and I had forgotten to tell Windows that it was going to be just fine if ASP.NET was enabled.

But that's just the start of the problems...

The first time I ran the install I went through the usual stuff. Fill out the page, click next, read the complaint about Windows feature X not being installed, install feature X, click next, and try again. It didn't take long to work through that since I knew what to expect after the first complaint. What I didn't expect was the flat-out failure while creating the databases. Since I wrote most of the V1 database install code, I knew where it was when it failed, but I sure couldn't figure out why from the message. I mean, come on, the step it was working through was pretty simple. It turns out that when setup rebuilds the foreign key constraints in the database, it simply walks all of the ones it finds in the current database...

Well, I turned on verbose setup logging (/l*v logfile if you haven't seen it yet) and tried again. Sure enough, it was failing at step '03 fknfr'. I stared at it for a while, cleaned everything up, fired up SQL Profiler, and tried my third install of the night. Profiler showed that, yes, it was failing there. So I grabbed the script off the installation CD and tried to run it manually (but not before turning off the actual exec calls and replacing them with print) so I could see why things were failing.

Well, that was a weird experience. I ran the script, and the table names that were scrolling by were only vaguely related to the ones I would normally find in CRM. Oh, don't get me wrong, the expected tables were all there, but so were a bunch of other tables that I wasn't really expecting. To say the least, I was getting nervous, because the table names shown were actually tables from another, very unrelated database on the same server, and I started thinking that somehow setup had changed to an incorrect database context and trashed something.

To make this a little less painful, let's just say I spent quite a while staring at things until I remembered that the 'model' database on that server was actually set up to create a completely different type of table structure for each new database. Since CRM creates three databases during setup, it got seeded with three complete copies of that structure before it even started installing its own stuff.

The moral of the story is simple - make sure you have an empty 'model' database or very bad things might happen.
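A quick sanity check before running setup (SQL 2000 syntax): any user tables listed here will be copied into every database created on that server.

```sql
USE model
SELECT name FROM sysobjects WHERE type = 'U'
```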

By the way, even after switching database servers, I was still unsuccessful installing the product. Sure, things seemed to be working well, but then I ran into some weirdness with importing the default data (aka roles, privileges, and other goodies). Even after what seemed like a complete install things didn't work, which I'm attributing to the Whidbey bits cluttering up that server. I'm also assuming that switching from a 'normal' SQL installation to one running in a named instance is causing me some grief.

As I'm recounting this fun I'm watching a new Server 2003 installation happen in the background. This time I'm going to make sure all the stuff that's needed is installed when I start and I won't install anything else. Guess I should have read the IG...


Code generators

I opened my inbox this morning to find a link to a brand-new version of the XSD-to-code generator that I've been talking about. Just wanted to share the good news. There was a strong feeling that taking a dependency on a code generator from an outside team would be a bad thing because a) it wasn't built "here" and b) it was a code generator. Well, I guess none of that matters now...

If you're doing any work with the platform XSD (not the as-shipped ones, we know they're not friendly), then grab this tool and give it a try. I use it every day in unit test development - no more building XML strings for this developer.


Programming models

Some days I really wonder what's right for the ISV / VAR community when it comes to programming models. I know what I'd like to see if I were writing code against MS-CRM. The problem is that I don't know what you want.

We have this great infrastructure at MSFT that allows us to stay in touch with the community. It works well when the contacts are asked the right questions and when the PM knows what to listen for in an answer. It breaks down when the questions are too generic, too specific, or flat out wrong. That's why I don't really know what you guys want in a programming model.

In MS-CRM there's only one way to go after the platform and that's using the SOAP proxy. Sure, it's possible to use the WSDL and generate client-side code directly, but it's a pain because of name collisions and all the other goo that happens when you add a web reference. There are also the well-known problems with the interfaces as they stand today which I've commented on before.

We've spent a lot of time talking about the ideal interfaces, programming model, and interaction model for the next releases. The problem is that we're talking about it but your voices aren't being represented anywhere. So, I'd like to see if 1) anyone's interested, 2) anyone's listening and 3) if anyone wants to comment.

Option 1 - we leave things the way they are. I won't go into this because everyone understands how things work. You get to keep Intellisense on the API signatures but everything else is a string.

Option 2 - clean up the interfaces by choosing a better naming convention, getting rid of the XML strings, and removing some of the 'extra' parameters. You get IntelliSense on the API signatures here too, but no more strings (and no IntelliSense on the entities themselves). The one saving grace here is that the "objects" are really extensible. Because they're nothing more than a property bag, it's very easy to add new attributes without breaking things (this is one of the reasons we have XML strings today).

Option 3 - get really radical and move to a type-safe, SOA-based model with only 6 methods but with a pile of messages. You get full IntelliSense here including type-safe entities and interfaces. The price you pay is dealing with the extensibility problem (i.e. what happens to your entities when another customer modifies the entity schema - this might be a recompilation or a property bag over the extra attributes).

Option 2 might look a lot like the interfaces do today, with the exception that they'll take an "object" instead of XML. This object simply wraps a property bag of strings. You'd get run-time type checking like you do today, but at the expense of either keeping a client-side copy of the metadata or round-tripping to the server.

The code might be something like this (after adding a reference to the metadata, entity, and service assemblies):

AccountWebService accountService = new AccountWebService();
Account account = new Account(); // or BusinessEntity

account["ownerid"] = ownerid;
account["owneridtype"] = "8";
account["name"] = "account name";
account["customertypecode"] = "21"; // is this a string?

string accountId = accountService.Create(account);

Principal p = new Principal();
p.Type = sptUser;
p.Id = someOtherUserId;

uint rights = 1; // READ?

accountService.GrantAccess(accountId, p, rights);

As you can see, there's an account "object" on the client, but it's really just a name. The real class is BusinessEntity, which is a thin wrapper around a NameValueCollection.
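To make that concrete, here's a sketch of what the wrapper could look like. This is illustrative only, not shipping code:

```csharp
using System.Collections.Specialized;

// A property bag of strings; type checks would happen at run time
// against the metadata, not at compile time.
public class BusinessEntity
{
    private NameValueCollection properties = new NameValueCollection();

    public string this[string attributeName]
    {
        get { return properties[attributeName]; }
        set { properties[attributeName] = value; } // metadata validation would hook in here
    }
}

// "Account" adds nothing but a friendly name over the same bag.
public class Account : BusinessEntity
{
}
```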

The third option would look structurally the same, but there would be a few key differences. This is after adding a web reference to the CRMWebService (which would get the interfaces and schemas).

CRMWebService crmService = new CRMWebService();

// get a new account instance with default values
Account account = (Account)crmService.CreateNew(typeof(Account));
account.name = "account name";
account.customertypecode = 21; // this is an int

Guid accountId = crmService.Save(account);

GrantAccessMessage grantAccess = new GrantAccessMessage();
grantAccess.Moniker.Type = typeof(Account);
grantAccess.Moniker.Id = accountId;
grantAccess.Principal.Type = sptUser;
grantAccess.Principal.Id = otherUserId; // Guid
grantAccess.GrantRead = true;

crmService.Execute(grantAccess); // dispatch the message through one of the handful of methods


I won't cover the V1.x flavor because I don't feel like typing < and > all over the place. That's just too much work.

I'd like to hear from you which option is preferred (and option 1 is still an option). If there's another model you'd prefer, I'm listening. Just don't ask for DataSets everywhere, please.