Thursday, 18 October 2012

Fix for VS build error: The "FindRibbons" task could not be loaded

I opened an Outlook customisation/plug-in project to do some modifications today and on building received the following delightful error message:

The "FindRibbons" task could not be loaded from the assembly Microsoft.VisualStudio.Tools.Office.BuildTasks, Version=, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a.  Confirm that the <UsingTask> declaration is correct, that the assembly and all its dependencies are available, and that the task contains a public class that implements Microsoft.Build.Framework.ITask

Wow, highlight of my week!

It looks like the issue is related to installing VS2012 side by side with VS2010: the install updates the DLLs relating to MSBuild, but not the references held in the standard configuration file. The Microsoft forums (via Google) found me my answer in a reply by 'sriram_electric' (thank you!):

For those who encounter this problem, goto C:\Program Files\MSBuild\Microsoft\VisualStudio\v10.0\OfficeTools and open the Microsoft.VisualStudio.Tools.Office.targets file.
Replace all with

In my case I had to go to Program Files (x86) as I'm running a 64bit OS, but other than that, and a quick restart of VS, the problem was resolved.
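To save hunting around, here is a minimal Python sketch that builds the full path of the targets file under both Program Files roots. The relative path is the one quoted in the forum reply; the two root folders on the C: drive are an assumption about a default install, so adjust them if your machine differs.

```python
from pathlib import PureWindowsPath

# Relative location of the targets file, as quoted in the forum reply.
TARGETS_RELATIVE = PureWindowsPath(
    "MSBuild", "Microsoft", "VisualStudio", "v10.0", "OfficeTools",
    "Microsoft.VisualStudio.Tools.Office.targets",
)

# Assumption: default install roots on the C: drive (32-bit and 64-bit OS).
DEFAULT_ROOTS = [r"C:\Program Files", r"C:\Program Files (x86)"]


def candidate_targets_paths(program_files_roots):
    """Return the full targets file path under each Program Files root."""
    return [
        str(PureWindowsPath(root) / TARGETS_RELATIVE)
        for root in program_files_roots
    ]
```

Calling candidate_targets_paths(DEFAULT_ROOTS) gives the two places to look; on a 64-bit OS the second one is the likely home of the file.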

If this doesn't work for you, there are other suggested solutions given in the thread above.

- rob

Tuesday, 2 October 2012

Upgrading SharePoint 2007 workflows to SharePoint 2010

As part of an ongoing migration project I'm looking at shifting a bunch of sequential and state machine workflows written for SharePoint 2007 (in VS2008) into SharePoint 2010 (which uses VS2010). The business logic behind them is sound enough and I'm on a relatively tight time-scale, so it's not feasible to carry out a review of these processes at the same time. Otherwise I'd jump at the chance to LEAN them up: whilst they are relatively sound in respect of their reflection of current business practice, I wouldn't claim that the business practices they support are ideally optimised!

There's a good blog post on how to do it here:

To go with this I'll also need to update the associated InfoPath 2007 forms to 2010. Again, someone else has done the hard work of working out how, here:

I'll be reporting back soon once I've verified that both these approaches actually work!

- rob

Friday, 6 July 2012

Friday, 29 June 2012

BISM and application development convergence - the holy grail of data?

A thought struck me whilst at a TechEd2012 session on DAX and its evolution from a functional thing within Excel to a full-blooded query language.

This in itself is cool and interesting, but with all of the sessions that I've been attending on the Business Intelligence Semantic Model area (BISM) and the way that data experts can build rich Tabular semantic models using PowerPivot, the following thought occurred:

When will the development and analysis usages of data found themselves on a common model? At what point will a developer make use of the same BISM as someone building a report on that data? Surely this is the holy grail of data interaction and usage, an assurance that the developer benefits from the understanding of data that a business analyst has and that the business analyst can benefit from applications that have been implemented on that common understanding of the purpose of data.

Wednesday, 27 June 2012

SharePoint: buy the platform, sell the app

In discussions with a few folk regarding SharePoint over the last couple of months I have come to the conclusion that most businesses (and Higher Education is no exception to this) purchase SharePoint on the promises of a slick, shiny demonstration to senior managers (probably at one of those Caribbean 'conferences' I'm always hearing so much about) and the annoying fact that you can, more or less, do anything with SharePoint.

#TEE12 Day 1 roundup: SharePoint BI&dev and awesome Agile estimation

Today I attended the following:
  • Keynote
  • Creating Self-Service BI Solutions with SharePoint 2010 (ooh on blog and everything!)
  • Developing and Managing SharePoint solutions with Microsoft Visual Studio
  • Making Agile Estimation Work
Below are the major bits remembered off the top of my head (the transformer is recharging!) To take them one at a time:

Thursday, 21 June 2012

Conclusions from #XCRI #CourseData Development Day

A good day was spent examining the major elements of technical implementation at the XCRI-CAP development day in London yesterday. Here are my collected thoughts on the major themes.

Wednesday, 20 June 2012

#JISC #XCRI #CourseData - Development Day

Today is development day for stage 2 of the JISC course data project:

I'm particularly interested in how people are approaching implementation of course data feed information within the context of the set of policy issues that the course data project raises.

The day promises to provide a bit of networking with other folks working on development of feeds and interaction with aggregators, demonstration of various validation and aggregation tools as well as some code sharing activity.

Hopefully this will serve to kick start some of the work left to do for our particular project instantiation!

- rob

Thursday, 2 February 2012

Term sets or External lists in SharePoint 2010?

Term sets

Term sets are an interesting concept within the Managed Metadata Service (there's a great walk through of the Managed Metadata Service, including term sets by Andrew Connell). I particularly like the distinction between local and global term sets as it allows the specification of terms at a global level that can then be consumed across separate site collections.

How could you use this in the HE context? Well, potentially you could create a term set that contains the Faculty<->Department relationship such that anyone looking for content regarding a specific department could have assistance when using search because those adding content have assistance when completing metadata.

You could also potentially recreate the entire course catalogue by subject area (a Courses term set subdivided by course area and then listing each course) and this itself could be made useful when listing data elsewhere on the site.

The risk to me with term sets is that they could potentially tread on the toes of LOB data, leading to inaccuracies and inconsistencies across pure LOB systems and those based on the SharePoint platform. Sure you can put synchronisation in place, but it strikes me that unless the business data only exists inside SharePoint, then an external list is probably a better option than attempting to keep term sets in sync with external database (or other) content.

External lists

External lists are nice, but do have the overhead of requiring you to set up an External Content Type. Fine if your data environment is a simple one (for pulling a list of sales countries out of a sales DB, for example) but not so simple if your data contains complex relationships.

If all you want is for users to be able to use that list as a read-only source of metadata terms, you'll probably be able to get it working readily even in complex scenarios, but external lists give you the option of CRUD functionality and somewhere down the line you just know your users are going to say "we like this AJAX rich UI environment, please can we update stuff here rather than in the grey box of our MI system?" (ok, they're unlikely to put it like that, but you know what I mean!) There are some very useful resources on creating external content types with CRUD methods but to claim it's a straightforward process would be rather misleading. As a dev I tend to underestimate the impact that breaking open Visual Studio has, but let's be honest here: if you have to go there it's going to take longer, and more time = more resource cost, resource that from the business's point of view may be better spent elsewhere.


So, where does this leave us then? Typically with the SharePoint stack: stuck in the middle. 2010 is a frustrating release in many ways. It gives us that extra step along the line in a number of areas (BCS is a million times more attractive than BDC ever was, for example) but still there are points where you just wish that the SharePoint product development cycle was aligned better with the .Net framework development cycle at MS HQ. One of the reasons that BCS isn't quite there yet is that it can't use any of the nice Entity Framework stuff introduced in .Net 4, and that's because 3.5 was the nailed down version for the development of SP 2010.

SharePoint is such a great platform that I get quite annoyed when MS drop something in that looks on the surface like an out of the box feature that works as a great short-cut to a time-consuming Visual Studio based process; I know by now that, if not properly prepared for, those are the features most likely to lead to scalability and performance issues further down the line. This for me is the number one question that every SharePoint dev should be asking of choices between routes like Term sets vs External lists (btw I'm not suggesting this is a blanket either/or in this case, just for this example of lookup data): if I take route A (the short-cut) am I potentially compromising the usability and scalability of my SharePoint application once users really start using it? Of course the most annoying thing about attempting to answer this question is knowing what the issues are with route A. Funnily enough MS are slow to point out the short-comings of their new feature sets and the community can only work so fast to unpick them all for you. It can be a painful process figuring it out but as always with these things: test, test and test again. Then have a beta, then test again...

SharePoint short-cut features tend to encourage the quick roll-out of very useful functionality; the giant caveat emptor that should always be in play is "what's my scale up route?"

Monday, 23 January 2012

Use of Remote BLOB storage in SharePoint 2010 to reduce backup overheads

King of the snappy and interesting blog post titles, that's me...

Our current 2007 storage topology is, to be frank, not fit for purpose. Too much content sits inside a single site collection, which means a single database and understandably the chaps that run the storage infrastructure, especially the backup and disaster recovery set up, aren't happy at having to back it all up overnight, every night.

Part of the purpose of our ongoing migration work is to look at this set up and see if there's anything we can do to make it a little more sane and manageable for backup. One option is to look at persisting all of the main document storage (which is to be split more sensibly across a multitude of content silo databases) via Remote BLOB (Binary Large OBject) Storage (RBS). This is new to SP 2010, though not new for the world of SQL: it uses the FILESTREAM option in the background to carry out this cleverness, and FILESTREAM has been around since SQL 2008.

On the surface this would be a nice neat solution for our situation and looking at the planning documentation makes me think that there are scenarios where it could work nicely, but you obviously have to be very careful about how you use it as there are some key limitations to be aware of.

Key considerations

For our context, the critical issues to note about RBS enabled databases are:

  • 200GB limit per database (well, ok, not actually a 200GB limit, but in our situation, as this article explains, we would have to carefully benchmark our IOPS to go past 200GB up to the max 4TB enabled by SP 2010 SP1)
  • No encryption supported, even transparent DB encryption
  • RBS is bad for situations where you're writing lots of small BLOBs (< 256KB); it's good for situations where you have fewer, larger BLOBs (> 256KB)
  • Does not support database mirroring for FILESTREAM enabled DBs. You cannot have FILESTREAM filegroups on the principal mirroring server
Obviously your issues may differ and the fit between your data environment and RBS may be better or worse. 
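As a rough way of judging the small-versus-large BLOB point, here is a Python sketch that profiles a list of document sizes against the 256KB figure. The function name and the choice to count a file of exactly 256KB as 'large' are my own assumptions; how you gather the sizes (from a content database report, a file share listing, etc.) is left out.

```python
# The 256KB figure from the considerations listed above.
RBS_THRESHOLD_BYTES = 256 * 1024


def blob_size_profile(sizes_in_bytes):
    """Count BLOBs below and at-or-above the threshold.

    Assumption: a BLOB of exactly 256KB is counted as large.
    """
    total = len(sizes_in_bytes)
    small = sum(1 for size in sizes_in_bytes if size < RBS_THRESHOLD_BYTES)
    large = total - small
    return {
        "small": small,
        "large": large,
        # Fraction of BLOBs that would suit RBS on size alone.
        "large_fraction": large / total if total else 0.0,
    }
```

A low large_fraction would suggest most of the content is on the wrong side of the threshold for RBS to pay off, though it says nothing about the other limitations.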

No database mirroring? Erk!

The last issue suddenly set off alarm bells for me as we're intending to use a mirrored DB setup as the SQL backend for our SP infrastructure. The limitation itself makes perfect sense: mirroring works by sending transaction logs directly from the principal to the mirror server, BLOBs don't exist in transaction logs, and the mirror server cannot (by definition) claim access to the principal server's contents. So we're at a bit of an impasse: we can hardly claim to be improving the resilience and DR strategy of our SP infrastructure if we have to ditch DB mirroring as an option.

Alternatives to mirroring

There are some alternative routes available for DR/resilience. FILESTREAM does work in concert with fail-over clustering in SQL 2008 R2, but clustering is less performant than mirroring when it comes to providing resilience (even though it's probably quicker in general operation than mirroring as it doesn't have a principal 'waiting' for the mirror to catch up). Recovery in fail-over clustering is slower and there is no load-balancing of effort. It's for this reason that most big orgs use mirrored clusters, giving them the best balance of immediate switch-over and good overall DR.

We're not in that situation and in any case it wouldn't resolve the FILESTREAM issue, only compound it, so at present I really don't know what the best option is going to be. Log-shipping also works with FILESTREAM but I'm not keen on that option as again fail-over is slow and even worse, manual.

So, this leaves us (me) in something of a dilemma. We can't take advantage of FILESTREAM, which is required for RBS, unless we ditch the idea of database mirroring as part of our resilience and DR model. The alternatives to mirroring have their own draw-backs that I'm not sure I'm happy to compromise on.

Interestingly (though of no help here as it's in RC0 still), SQL 2012, 'Denali', brings a new option to the table in the form of AlwaysOn Availability Groups, which do combine with FILESTREAM (though whether SharePoint will support this is not yet obvious).

In summary then I'm likely going to need to talk very nicely to my systems backup guys and hope they don't baulk at the prospect of not being able to do differential backups of nice neat files sat on disk...

- rob 

Folders are so last version...

So, I haven't mentioned it but we're currently engaged in (finally!) getting around to upgrading our infrastructure from MOSS 2007 to SP2010.

It's been a very long time coming and it's providing an opportunity to have a really good look at the SP farms we already have, how they're structured and to correct a few less-than-best-practices that we've fallen into.

We're working with an excellent consultant to try and get the most out of this transition across a number of areas. I want this to be an opportunity seized rather than suffered, and I always try to view an experienced external perspective as an opportunity to take on board deficiencies, learn best practice and implement solid plans for ensuring the long term resilience and scalability of the SharePoint platform at my organisation.

Anyway, one of the things that he's mentioned in passing is a new feature in SP2010 called 'Document Sets'. Doing some reading around on the subject (see how they work, how they compare to folders and what they really are) I think I'm almost sold on the idea of ditching the current storage architecture, which is heavily reliant on folders for content segregation, in favour of the document set approach. I wanted to take the time to explain why folders were the answer for me in 2007 so that I can better explain why document sets seem to be a better option when shifting to 2010.

The existing environment uses folders

The prevailing view in the SharePoint community (and I'd guess in any context where good document management systems are in place) is that folders are dead and metadata is king. I completely agree with this sentiment, which is why I used folders in our 2007 environment.

Wait, what? Well, as I say above, for me folders in 2007 were never about content organisation, that is to say, they were not a useful route for information discovery (except in some very rigid circumstances). Rather, they were a very useful route for achieving with a minimum of fuss a way around one of the most annoying limitations of 2007, the well-known 2,000 items in a container limit. Folders are one of the quickest ways of getting round this and the architecture I put in place to observe this recommendation ensures that no user can accidentally navigate to a page where they experience horrendous slow down as they try to view 10,000 documents. So for me, folders were not dead, in fact they were very much alive; not as organisers of content but as a way of segregating it. All access to content is via indexed metadata properties and, essentially, customised data view web parts that manage the display of searched content.
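To make the 'folders as segregation' idea concrete, here is a small Python sketch that assigns each document to a bucket folder by its running number so that no folder ever holds more than 2,000 items. The bucket naming scheme and the use of a running item number are purely illustrative assumptions, not a description of the scheme actually in place.

```python
# The per-container item limit discussed above.
ITEMS_PER_CONTAINER = 2000


def folder_for_item(sequence_number, items_per_folder=ITEMS_PER_CONTAINER):
    """Map a zero-based running item number to a bucket folder name.

    Hypothetical naming scheme: 'bucket-00000', 'bucket-00001', ...
    Each bucket receives at most items_per_folder items.
    """
    if sequence_number < 0:
        raise ValueError("sequence_number must be zero or greater")
    bucket_index = sequence_number // items_per_folder
    return f"bucket-{bucket_index:05d}"
```

Because users reach content through metadata-driven views rather than by browsing, the folder names never need to mean anything to a human.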

Document sets as potential folder alternative

Now that document sets have come on the scene I feel I have a potentially better tool at my disposal for organising content related to the metadata 'objects' in existence. I'm especially interested in the 'Welcome Page' concept as it looks as though it would be a very neat way of managing some of the functionality I've already put in place in the 2007 infrastructure, by displaying other related LOB data with the documents in the set. Anything that makes the rolling up of these sorts of 'role-based' views easier is definitely a step in the right direction in my book. The content routing feature would also hopefully take the place of my existing 'uploads triage' system, allowing users to upload docs that are automatically routed to the correct document set. It's a bit of a fudge by all accounts, but it could be extremely useful; though it would necessitate some pre-validation of metadata to ensure that users can't enter invalid IDs, or IDs that are valid in formation but don't refer to a real applicant/student.
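The pre-validation step might look something like the Python sketch below, which separates an ID that is badly formed from one that is well formed but doesn't belong to anyone. The eight-digit format and the in-memory set of known IDs are assumptions for illustration; in practice the lookup would go against the LOB system.

```python
import re

# Assumption: IDs are exactly eight digits. The real format will differ.
ID_PATTERN = re.compile(r"\d{8}")


def validate_person_id(candidate, known_ids):
    """Classify an ID as 'malformed', 'unknown' or 'ok'.

    known_ids stands in for a lookup against the real applicant/student
    records; here it is simply a set of strings.
    """
    if not ID_PATTERN.fullmatch(candidate):
        return "malformed"  # fails the format check
    if candidate not in known_ids:
        return "unknown"    # valid in formation, but no such person
    return "ok"
```

Only an 'ok' result should let the upload proceed to routing; the other two give the user different, more helpful error messages.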

Allotting a document set per student and per applicant, or even generating a document set at the applicant stage that then gets translated into a document set for the student, will give a lot of flexibility when it comes to managing the display, location and eventual archiving of student documents.

Document set limitations

However, there do appear to be a few fairly critical limitations related to document sets that may make this a tricky transition.

One thought I'd immediately had with document sets was: how about an overarching set that has subsets relating to the applicant and student phases of the student life-cycle. Well, no dice, you can't nest document sets (well, ok, I can see why, they shouldn't just become a folder-like entity, but still, it would've been nice here).

I'm also not sure how document sets will work in the archiving situation as document sets can't be declared as in place records; you have to push them into the records centre first and then declare. I wasn't envisaging users having to manually declare an applicant or student file as a record, but what if they do want to? It would mean additional training and guidance at the very least.

As with any new feature, it's always useful to make sure you do a wide review of available 'literature' on it to get a 'warts and all' view of it. Then you must spend time in advance of even double-clicking on your IDE planning through how you can use it to best advantage in your own environment. For me I think the most likely outcome would be three document set types, one related to direct applications, one to undergraduate applications and another to students. It would be possible to break student document sets into course types (i.e. undergrad, postgrad taught, postgrad research) but we get plenty of students who have multiple stays that cut across these boundaries and even within a given stay a student can have both taught and research components to their studies covering multiple independent courses.

Where to go from here?

This is still something of an open question though. Whilst I like very much some of the commonality of content you get with document sets, there are advantages to going 'lightweight' and letting the metadata do all of the talking at the document level, especially when it comes to dicing up those documents for display in a UI.

I already have interfaces set up for managing the relationship between the applicant and student documents an individual has with us, including switching between the different relationships depending upon the role of the end user. My suspicion is that this may not be as easy with document sets (though my gut tells me it's perfectly possible) or that it might not be as intuitive. I'm also concerned that some of the limitations of document sets might appear to be ok right now, but could cause problems further down the line.

If I can get my head around how a single document set could usefully represent the whole life cycle of applicant <-> student <-> alumnus then I'd be making some progress, but I'm not sure they're actually that well suited to that kind of 'object'. They seem specifically targeted at the legal community who like to keep case files and whilst you could treat the life-cycle as a 'case' I don't think that this would necessarily serve the users in the best way.

- rob

Tuesday, 10 January 2012

A way to autogenerate a command line install of SQL 2008 R2

Came across this really handy piece of advice today:

Allows you to very easily construct a valid command line install string for any variation of SQL installation.


- rob

Monday, 9 January 2012

How to build sustainable and supportable workflows for SharePoint

I spend a lot of my time developing workflows and forms (in InfoPath) to replace existing paper-based processes. As such I have been through a lot of learning experiences, some of them rather painful, when it comes to successful implementation. There are plenty of technical articles out there about how to develop a workflow in the SharePoint environment; SharePoint is an extremely well 'socialised' development environment with people giving away superb 'free' advice and code all over the place. Google, of course, is your friend here (or whatever search engine you've sworn undying allegiance to in the battle to win the war of all wars: which free text box we type our search terms into.)

For me, the hallmarks of good workflow design are traceability (meaning that you can quickly, and independently of the SharePoint environment, deliver real-time status and progress information on workflows); modularity (being able to 'chunk out' workflow sections where you need similar functionality in a range of areas); and sustainability throughout the (re)development process. This last one is critical in the initial stages of workflow release, because it is in the nature of process mapping that intricacies are missed and certain specific variations unaccounted for.

Sustainability. Let's face it, nothing is more annoying (and yet nothing should be less surprising) than crafting a nice neat workflow, having it tested (supposedly) to user satisfaction and then hearing "ah, we forgot about this case, because they hardly ever come up" or "we weren't at this stage of the cycle when we were thinking about this, so we forgot that between January and June we deal with them differently." No matter how good your up-front business analysis is, common sense (and human nature) suggests that you will run into these situations and if you haven't developed your workflow with the capacity for it to restart and pick up where it left off, then you will immediately break or damage all existing workflows when you deploy the new version that takes into account the latest remembered foible. Obviously you want to make sure that your process mapping and review methodologies are as robust as possible, but not planning for potential missed cases is asking for trouble. Sure, you could deploy the new version under a different solution package but this then means it must have a different name (you can't have two workflows on a single list/site with the same friendly name), different workflow guid and all sorts of other things that mess with your ability to provide good tracing and performance indication data. I resolve this by ensuring all workflows I develop are 'self-aware' enough that they can tell, from the state of the workflow form, where they need to begin from in the event of being restarted. They also know how to maintain their relationship with the tracing data in the event of a restart such that you don't get 'data gaps' where a restart has taken place. This is one of the key relationships in workflow implementation that is often forgotten: that between form and workflow logic.
Form design is seen as important from a UI perspective but it is not necessarily seen as a way of persisting 'dehydrated' wf data sufficient to pick up where we left off if, for whatever reason, a workflow needs to be restarted.
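The 'self-aware' restart idea can be sketched in a few lines of Python: the form carries a completion flag per stage, and a restarted workflow simply looks for the first stage not yet marked as done. The stage names and the shape of the form data here are hypothetical; a real workflow would read them from the InfoPath form fields.

```python
# Hypothetical stage order for an illustrative approval process.
STAGES = ["submission", "manager_approval", "finance_check", "archive"]


def resume_stage(form_state):
    """Return the first stage the form has not recorded as done.

    form_state maps a stage name to a dict with a boolean 'done' flag,
    standing in for the data persisted in the workflow form.
    Returns None when every stage is already done.
    """
    for stage in STAGES:
        if not form_state.get(stage, {}).get("done", False):
            return stage
    return None
```

A redeployed workflow that calls something like resume_stage on start-up can skip straight past the work already recorded in the form rather than beginning from scratch.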

Modularity is hard to deliver in SharePoint workflow development (although you can cut and paste from one designer window to another in VS, I wouldn't recommend it: the order of parts and how various conditional logic works seems to get a bit screwed up and patching back in all of your declarative conditions takes a while). However whilst the actual design surface can be a pain to modularise, there's no excuse for not developing a good 'helper' class for common functionality and ensuring that your business objects are developed to support liaison between them and the workflow layer. If you're keen on delivering high performance KPI-style data to a usable front end, then that means developing a class for handling interactions between your workflow and tracing environments.

Additionally, if you're going to be using the functionality a lot, then developing a custom workflow action might be the way to go.

One key factor in delivering both restartability and modularity is getting to grips with state machine workflows (2007, 2010). They allow far better flexibility in dealing with 'odd' restart situations and allow you to design a workflow, have it be 'corrected' and not have to completely re-do your process flow logic. Additionally, you can only really do repeating tasks and achieve useful functionality such as workflow suspension (there's a requirement local to me that workflows can be hibernated for a dynamically set period of time from any stage of the workflow) and 'sending back' (users love being able to send something back a stage, or two stages and let someone know they haven't done their 'bit' right) when you delve into state machine workflows. They make each part of a workflow a modular component that can be run through an indefinite number of times and suddenly give you an immense amount of flexibility when it comes to developing truly 'business process like' workflow solutions.
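To show why the state machine shape makes 'sending back' and hibernation easy, here is a toy Python model. It is not SharePoint or Workflow Foundation code, and the state names are invented; it just demonstrates that once each stage is a state, moving backwards or parking the workflow is one more transition rather than a rewrite of the flow.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical ordered states for an illustrative process.
STATES = ["draft", "review", "approval", "done"]


@dataclass
class WorkflowInstance:
    state: str = "draft"
    hibernated_from: Optional[str] = None
    history: list = field(default_factory=list)

    def _move(self, new_state):
        # Record every transition so progress can be traced later.
        self.history.append((self.state, new_state))
        self.state = new_state

    def _position(self):
        if self.state == "hibernated":
            raise ValueError("workflow is hibernated; wake it first")
        return STATES.index(self.state)

    def advance(self):
        position = self._position()
        if position == len(STATES) - 1:
            raise ValueError("already in the final state")
        self._move(STATES[position + 1])

    def send_back(self, steps=1):
        position = self._position()
        if steps < 1 or position - steps < 0:
            raise ValueError("cannot send back that far")
        self._move(STATES[position - steps])

    def hibernate(self):
        # Allowed from any stage; remember where to return to.
        self._position()
        self.hibernated_from = self.state
        self._move("hibernated")

    def wake(self):
        if self.state != "hibernated":
            raise ValueError("workflow is not hibernated")
        target, self.hibernated_from = self.hibernated_from, None
        self._move(target)
```

Each state can be entered any number of times, which is exactly what a sequential flow struggles to express.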

Traceability. So, why send SharePoint hosted workflow data outside of SharePoint for the purpose of delivering information on status and progress of workflows? Surely, I hear you all not asking, that's mad and I should just let people check the status of workflows using native SharePoint data? Well, if you are developing workflows where all of your business data is similarly hosted in SP lists or tied just to SP documents then that's not the worst idea in the world.

There are still some hurdles to overcome when it comes to making sure users can find that data quickly (drilling down to task-level information for a SP workflow takes way too many mouse-clicks and knowledge of where the 'workflows' link resides) and there's some help on delivering that already available out there on various blogs (this blog uses jQuery to start a workflow 'on page', the same approach can grab more detailed information on workflows and display it 'in line' with other data).

When it comes to delivering what I'd call 'real world' workflow solutions, you often want to combine SharePoint workflow data with LOB data and then surface it not only in your SharePoint environment, but probably elsewhere as well. In my case a lot of the data needs to be combined and presented within SharePoint as part of a LOB application, but also within SharePoint as part of our Reporting Services infrastructure. Now, I know it's technically possible to use SP web services with SSRS to produce reports that draw from both SQL LOB data and SP wf data, but I'll tell you right now that from both a performance and support standpoint it certainly isn't desirable. Additionally (and this is a gotcha that appears to have been cleared up in SP2010, though not really to satisfaction) you have the issue of longevity of SP workflow related data. SP likes to clean up after itself and even where data does persist, it is not in a form that any normal end user would comprehend. SP lists (to a lesser extent in v2010) also have certain performance issues relating to large numbers of items being returned in a single query and, over time, all workflows would likely run into them (unless they're exceptionally low volume).

As such I persist a set of workflow related status data in SQL and log workflow progress as it goes along and then use this to tie together workflow progress and other LOB data to produce usable reports, KPIs, and tracing information as part of the LOB applications I deliver entirely within SharePoint. I can then also use this same data to expose information relating to SP-based workflows and processes to non-SP environments more straightforwardly than it might be attempting to get those other application environments to "play nice" with SP.
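A stripped-down Python sketch of that kind of tracing store follows. It uses SQLite purely as a stand-in so the example is self-contained; the table layout, the column names and the idea of keying on a business identifier (rather than the workflow instance ID, so a restarted workflow carries on the same trail) are all assumptions for illustration.

```python
import sqlite3


def open_trace_store(path=":memory:"):
    """Open (or create) a hypothetical workflow tracing table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS workflow_trace (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            business_key TEXT NOT NULL,
            stage TEXT NOT NULL,
            logged_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
        )
        """
    )
    return conn


def log_stage(conn, business_key, stage):
    """Append one progress row as the workflow reaches a stage."""
    conn.execute(
        "INSERT INTO workflow_trace (business_key, stage) VALUES (?, ?)",
        (business_key, stage),
    )
    conn.commit()


def current_stage(conn, business_key):
    """Return the most recently logged stage, or None if nothing is logged."""
    row = conn.execute(
        "SELECT stage FROM workflow_trace "
        "WHERE business_key = ? ORDER BY id DESC LIMIT 1",
        (business_key,),
    ).fetchone()
    return row[0] if row else None
```

Because the rows live in an ordinary SQL table, reports and non-SharePoint applications can join them to LOB data without going anywhere near SharePoint's own workflow history.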

So, what does this mean?

How this pans out in practical implementation and development terms is something I'll cover in a future post. The key messages I'm trying to get across here are that sometimes you need to think beyond SharePoint to deliver truly useful SP-based applications and that, especially with workflow development, you need to approach your implementation in a way that recognises some of the shortcomings of the SharePoint workflow deployment model, such that you don't have to reinvent the wheel just because an outlying case rears its head three months after full release.

- rob

Monday, 2 January 2012

When upgrades go bad... always have a backup plan!

So, the upgrade to R2 didn't exactly go according to plan. It transpired that the server had some invalid SSL certs installed on it and the upgrade wizard threw an error before continuing with the process.

Unfortunately the service then wouldn't start again, throwing errors about reserved URLs and not being able to load the default AppDomain. There was nothing obvious to resolve and a repair to the installation threw similar errors.

I already knew at that point that I had a rescue point if the upgrade didn't go well. A new Reporting Services installation will happily be configured to talk to a pre-existing Reporting Services database and that was always my fall-back position - to uninstall the instance and reinstall.

Once the reinstall had completed, I had to reconfigure the ports appropriately on the instance so that the SharePoint farm knew where to find it and then had to redeploy the datasources (unfortunately whilst I'd backed up the encryption key pairing I'd neglected to note the password, so I had to delete the key and recreate it, which removes all encrypted content from the RS database, including data sources).

With the data sources redeployed, the reports all began to function again and all was once again well with the world.

This whole process has reminded me of why I hate upgrade processes. Of course going back over things now, I see the guidance MS put out about SSL certificates, so you could quite easily accuse me of failing to do due diligence on the pre-upgrade checks recommended by MS, but given that those checks also include the oh so excellent advice to "carry out the upgrade on a test farm with an identical configuration" (oh sure, right, we all have an identical SP farm and RS setup lying about!) it's not surprising that sometimes you miss something.

My advice: if you can, treat upgrades as a migration, especially if anyone washes their hands of any responsibility regarding likely success by suggesting you should do the upgrade before you do the upgrade...

- rob