Tag Archives: Planning and Architecture

Active Directory Images are not imported by the SP2010 User Profile Import

This is something that I found when starting a new SharePoint 2010 (SP2010) Intranet project. You see, my client has quite a simple requirement, at least it seems that way at first.

They have an HR database which they use to store everything from employee details to security card information. For this reason they manage all of the employee photos here, and they are pushed into Active Directory (using the LDAP “jpegPhoto” attribute) to make them available in applications like Outlook.

So … to put it quite simply .. they don’t want to use SharePoint to upload their photos. In fact, they already have their photos in AD .. they just want to pull them into SharePoint 2010.

Can’t we just map the Profile Property to AD and Import the value?
So here is where we hit the roadblock.. The Active Directory “jpegPhoto” attribute is of type “Binary Data” and the SharePoint 2010 User Profile property is of type HyperLink (as it typically links to an image in an Asset Library in the My Site Host site collection).

As a result, you cannot import it using SharePoint 2010 functionality (although it may be possible if you have also purchased the more sophisticated “ForeFront Identity Manager Synchronisation Server” product).

Options ??
Well .. SharePoint IS used as a development platform so there are some options. Obviously there is a full API for reading and writing in and out of the User Profile database (some more accessible than others).

There are a few good blog articles that you can follow if you want to build your own import function. If you are happy to wait a while longer then I already have my own solution which I will be posting up with the following features:

UPDATE – Full Source code and WSP now published

* WSP Package
* Farm scoped feature, which installs a Timer Job attached to the MySite web application
* Iterates through all User Profiles, and finds and extracts Binary image data from Active Directory
* Automatically creates thumbnailed images in the My Site Host Asset Library
* Automatically updates User Profiles to point to those new images
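
The steps in that feature list can be sketched outside SharePoint. This is a minimal Python illustration only, not the code from the published WSP (which is a SharePoint Timer Job). The function names, the `DOMAIN_user.jpg` naming scheme and the library URL are assumptions made up for the example, and the thumbnail step is left out because it would need an imaging library.

```python
from pathlib import Path
from typing import Optional, Tuple
from urllib.parse import quote

JPEG_MAGIC = b"\xff\xd8\xff"  # every JPEG stream starts with these bytes


def plan_photo_import(account_name: str,
                      jpeg_photo: Optional[bytes],
                      library_url: str) -> Optional[Tuple[str, str]]:
    """Return (file_name, picture_url) for a profile, or None to skip it."""
    if not jpeg_photo or not jpeg_photo.startswith(JPEG_MAGIC):
        return None  # no photo in the directory, or not JPEG data
    # Hypothetical naming scheme: "DOMAIN\\user" becomes "DOMAIN_user.jpg".
    file_name = account_name.replace("\\", "_").replace("/", "_") + ".jpg"
    picture_url = library_url.rstrip("/") + "/" + quote(file_name)
    return file_name, picture_url


def save_photo(library_dir: str, file_name: str, jpeg_photo: bytes) -> Path:
    """Write the binary attribute to disk, standing in for the library upload."""
    target = Path(library_dir) / file_name
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(jpeg_photo)
    return target
```

The point of the split is that the binary value never goes into the profile itself: the bytes become a file, and only the resulting URL is written to the HyperLink property.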

During my research Glyn Clough also pointed me in the direction of another solution which allows you to do the same using a Console Application… personally I prefer the more “SharePoint” route of Timer Job and WSP package 🙂

In the meantime, you can feel free to avail yourselves of these posts:

Load Testing SharePoint 2010 with Visual Studio Team Test

 

So exactly what do we mean by "load testing" when it comes to SharePoint 2010? There are lots of methods that people tend to point towards, and I’ve heard "hits/visits per day" and "throughput" bandied about, but at the end of the day it comes down to 2 things:

 

  1. Requests Per Second

The requests per second literally means how many requests for information each server is capable of responding to per second. Each page may consist of dozens of artifacts, and for each artifact the browser needs to make a "request", therefore the more of these  "requests" it can serve the better.

 

  2. Server Response Time

The response time represents any processing on the server side (or TTLB – Time to Last Byte). This doesn’t factor in network latency or bandwidth though!
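
To make the two metrics concrete, here is a small Python sketch that derives them from a list of request samples. The `Sample` shape (completion time plus server-side milliseconds) is an assumption for the example; real test tools report these figures for you.

```python
from collections import Counter
from datetime import datetime
from typing import List, Tuple

Sample = Tuple[datetime, float]  # (time the request completed, server time in ms)


def peak_requests_per_second(samples: List[Sample]) -> int:
    """Highest number of requests that completed within any one second."""
    per_second = Counter(ts.replace(microsecond=0) for ts, _ in samples)
    return max(per_second.values(), default=0)


def average_response_ms(samples: List[Sample]) -> float:
    """Mean server-side response time across all samples."""
    if not samples:
        return 0.0
    return sum(ms for _, ms in samples) / len(samples)
```

Neither number includes network latency or bandwidth, which is exactly the caveat above.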

 

So the first thing you should think about is what can influence those metrics? And you end up with 5 different elements of your SharePoint 2010 farm:

  • WFE
  • Storage
  • Network
  • App Servers
  • SQL

 

This, as I’m sure you can imagine, can involve a LOT of testing. Simply testing the WFEs on their own is going to be a struggle for your average developer, and if you don’t have any industry testing experience you are going to have a hard time, but this is where the new SharePoint 2010 wave continues to make its presence felt ...

 

SharePoint 2010 Load Testing Toolkit

This is a new set of tools being released with the SharePoint 2010 Administration Toolkit and represents the easiest possible way of load testing your SharePoint environment. The main objective here is to:

 

  • Standardise and simplify the cost of load testing.
  • Simulate common SharePoint operations
  • Be used as reference to create other custom tests (for custom code, for example!)

 

The whole thing relies on the IIS analysis logs. These logs give pointers on where users are going, what kinds of requests they are doing (GET / PUT) as well as the types of files they are typically accessing (ASPX / CSS / JS / JPEG / DOCX / etc…)

 

The Load Testing Toolkit will analyse your IIS logs and automatically generate a set of load tests to appropriately match your environment, producing automated scripts that can be run in Visual Studio (either Team System or Team Test Edition).
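
To give a feel for what the logs can tell you, here is a rough Python sketch that reads W3C extended format log lines and works out the mix of request types. It is not the Toolkit's own algorithm (I have not seen its internals); the function name and the "METHOD .ext" grouping are my own assumptions.

```python
from collections import Counter
from pathlib import PurePosixPath
from typing import Dict, Iterable


def request_mix(log_lines: Iterable[str]) -> Dict[str, float]:
    """Share of requests per 'METHOD .ext' pair in a W3C extended log."""
    fields = []
    counts = Counter()
    for line in log_lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            fields = line.split()[1:]  # column names for the rows that follow
            continue
        if not line or line.startswith("#") or not fields:
            continue  # other directives, blanks, or rows before any header
        row = dict(zip(fields, line.split()))
        ext = PurePosixPath(row.get("cs-uri-stem", "")).suffix.lower() or "(none)"
        counts[f"{row.get('cs-method', '?')} {ext}"] += 1
    total = sum(counts.values())
    return {key: n / total for key, n in counts.items()} if total else {}
```

A result such as 50% `GET .aspx` and 25% `GET .js` is the kind of weighting a generated test would need in order to resemble real traffic.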

 

How hard can it be?

It is really quite simple (well, according to the ridiculously simple explanation at the SharePoint Conference 2009!). You literally point the tool at your IIS logs, and it spits out an entire suite of tests for WFE, SQL, Storage, etc., including all the metrics you would want (from CPU, RAM, Network and Disk I/O to SQL, ASP.Net and .Net Framework specific performance counters).

 

Then you just run it and analyse the results!

 

Analyse That!

The analysis couldn’t be simpler. With "Requests Per Second" and "Response Times" two of the metrics generated by the Visual Studio test reports, you really can’t go far wrong.

 

If you do find a problem, then you can delve into the new SharePoint 2010 "Usage Database" (which now runs on SQL Server) in order to identify exactly what was causing your dip in performance (say when someone deletes a large list?).

 

Tips and Tricks

There are a few gotchas. One is to be careful with "Validation Rules" in Visual Studio. Typically it will be happy with any page that returns a "200" code. This of course includes Error and Access Denied pages, because SharePoint handles them and returns a perfectly valid page (hence the 200 code!).
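
The logic you want from a stricter rule looks something like the sketch below. Visual Studio validation rules are written in .NET, so treat this Python version purely as an outline of the check; the marker strings are assumptions and should be compared against the error pages your own farm actually renders.

```python
from typing import Sequence

# Assumed markers; check them against the error pages your own farm renders.
ERROR_MARKERS = ("An unexpected error has occurred", "Error: Access Denied")


def is_real_success(status_code: int, body: str,
                    markers: Sequence[str] = ERROR_MARKERS) -> bool:
    """Pass only when the status is 200 and no error marker is in the page."""
    if status_code != 200:
        return False
    lowered = body.lower()
    return not any(marker.lower() in lowered for marker in markers)
```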

 

It is also recommended that you let your test "Warm up" for around an hour before you start taking the results seriously.  This allows all of the operations, timers and back-end mechanics of SharePoint to properly settle down, and means you are getting a realistic picture of how the environment will behave once it is bedded into production.
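
If you are analysing raw samples yourself, the warm-up advice boils down to throwing away everything before a cut-off. A minimal Python sketch, assuming samples are (timestamp, milliseconds) pairs:

```python
from datetime import datetime, timedelta
from typing import List, Tuple

Sample = Tuple[datetime, float]  # (timestamp, server time in ms)


def after_warm_up(samples: List[Sample], test_start: datetime,
                  warm_up: timedelta = timedelta(hours=1)) -> List[Sample]:
    """Drop every sample recorded before the warm-up period has elapsed."""
    cutoff = test_start + warm_up
    return [s for s in samples if s[0] >= cutoff]
```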

 

Finally, the SharePoint Usage Logging Database is a great location to grab information out of, so why not leverage other great aspects of the Office 2010 family. You could pull through the Usage DB information into Excel 2010 (perhaps using PowerPivot?) so that you can spin out charts and pivot tables to easily drill down into your data.

 

Typically load testing tells you WHEN bottlenecks are occurring, but the Usage Database can tell you WHAT is causing the bottlenecks!

SharePoint 2010: Architecture Guidance – things everyone should know!

Well, the final day of the conference came and with it some of the most useful sessions (from my perspective). One of which was the "Architecture Guidance for SharePoint 2010". This hopefully distils some of that information. It’s not a be all and end all, but hopefully points you in the right direction so that you can focus your research a little better!

 

[UPDATED: 27/10/2009 16:09]

 

UI Design

  • Entire interface in SharePoint 2010 to be W3C XHTML compliant
  • SharePoint 2010 "more accessible mode" to be WCAG 2.0 AA compliant
  • New ribbon interface replaces toolbars and menus (and considerations for old "CustomAction" commands which may no longer work!)
  • Wiki content allows web parts to be dropped in (removing over-reliance on web part zones)

 

Lists

There are a whole load of new List capabilities (in addition to the "External List" that BCS brings to the table!).

  • Lookup to Multiple

This means that when you create a new lookup column, you can now pull down additional fields from the lookup list item and use them for filtering.

  • CAML support for Joins!

You can now perform "JOIN" operations in your CAML queries for linking lists together.

  • Enforced List Relationships

You can now enforce specific relationships for lookup columns with two options:

  • Restrict Delete – cannot delete parent if child items exist.
  • Cascade Delete – If you delete the parent, all child items are automatically deleted (recycle bin aware with "restore" options!)
  • Store-level enforcement

This is code level "required fields", so the requirements are now enforced even when items are written through code!

  • Unique Fields

Specify a unique field, so that no two values can match (e.g. Email addresses in contacts list)

  • Compound Indices

If you want to query by 2 fields, you can now index both at once as a compound index.

  • <In> clause for reverse lookups

This allows a CAML query to do a reverse lookup to get all child items that are associated with the parent!

  • Formula based validation

e.g. Don’t allow Field2 to be lower than Field1.
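
To show how the relationship rules behave, here is a toy in-memory model in Python. It has nothing to do with the real SharePoint object model; the class and method names are invented for the illustration. It covers Restrict Delete, Cascade Delete, a unique field and the reverse lookup from parent to children.

```python
from typing import Dict, List


class RelationshipError(Exception):
    """Raised when a rule on the lists would be broken."""


class ListsModel:
    """Toy model: 'parents' and 'children' linked by a lookup column."""

    def __init__(self, delete_behaviour: str = "restrict") -> None:
        if delete_behaviour not in ("restrict", "cascade"):
            raise ValueError("delete_behaviour must be 'restrict' or 'cascade'")
        self.delete_behaviour = delete_behaviour
        self.parents: Dict[int, str] = {}    # parent id -> unique title
        self.children: Dict[int, int] = {}   # child id -> parent id

    def add_parent(self, parent_id: int, title: str) -> None:
        # Unique field: no two parents may share a title.
        if title in self.parents.values():
            raise RelationshipError(f"duplicate value: {title}")
        self.parents[parent_id] = title

    def add_child(self, child_id: int, parent_id: int) -> None:
        if parent_id not in self.parents:
            raise RelationshipError("lookup target does not exist")
        self.children[child_id] = parent_id

    def children_of(self, parent_id: int) -> List[int]:
        """Reverse lookup: all child ids that point at this parent."""
        return sorted(c for c, p in self.children.items() if p == parent_id)

    def delete_parent(self, parent_id: int) -> None:
        linked = self.children_of(parent_id)
        if linked and self.delete_behaviour == "restrict":
            raise RelationshipError("child items exist")
        for child_id in linked:  # cascade: children go with the parent
            del self.children[child_id]
        del self.parents[parent_id]
```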

 

Workflows

  • Out of the box SharePoint 2010 workflows can now be extended in SharePoint Designer 2010.
  • SharePoint Designer 2010 can be used to create "re-usable" workflows
  • Site Workflows – to manage processes across an entire site.
  • You can now import a SharePoint Designer 2010 workflow into Visual Studio 2010!
  • Import/Export workflow using Visio 2010 for visual workflow modelling.

 

Content & Document Management

  • "Document Sets" allow you to treat a group of documents as a single item (with 1 version history, group executed workflow and policy, and a "download as zip" option).
  • Managed Metadata Service  allows cross-farm Content Type management and a pre-defined enterprise taxonomy structure! This is a killer-app, bringing true enterprise content management to SharePoint 2010.
  • Enterprise Wikis allow more rapid "in edit" content, as well as Web Parts deployed directly into the rich text editor (no more web part zones?).
  • Spelling check and broken link check when you "check-in" WCM pages.

 

Event Handlers

Three new event handlers added (at last!!)

  • WebAdded – Fired every time a child site is created in the web.
  • ListAdded – Fired every time a list is created in the web.
  • Feature Upgrading  – Fired when a feature has its "upgrade" method called (more on this in a future blog post).

 

Security

  • Editing of ASPX pages now requires "Designer" permissions (instead of contribute).
  • XSS (Cross Site Scripting) protection for pages and web parts.
  • HTML pages will now "force download" by default. This stops people from uploading HTML files with malicious scripts, so if you click on an HTML file in a document library you will get a download dialog instead of the file opening in the browser!
  • There are still no field level permissions (it was estimated that this would add a 30% overhead to performance! Maybe in a future release)

 

BI and Connectivity

  • New Business Connectivity Services (BCS) allows no-code connections of databases and LOB systems to content types and lists with two-way synchronisation of data  and full CRUD support.
  • BCS interactivity from within Office clients, allowing LOB system data to be edited directly from desktop applications (such as Outlook and Word).
  • PowerPivot for Excel allows upwards of 100 million rows in an Excel workbook with phenomenal performance.

 

Office Application Support

  • New web level services for applications (Excel / Visio with JavaScript events!)
  • SharePoint Workspace to replace "Groove" for offline file support and editing.
  • Office Web Applications to allow for direct opening and editing of documents from within the browser!
  • InfoPath 2010 can now be used to edit the List forms out of the box!

 

Databases

  • Still a 100GB "limit" for content databases.
  • Still cannot have site collections spanning multiple databases.
  • New support for "Failover" databases, SharePoint 2010 is now SQL mirror aware!
  • All "Service Applications" have their own SQL database, along with many other new databases (e.g. Feed Activity, Social Data, Usage Logs).
  • New "read only content databases" open the door for simple content deployment (utilising SQL log shipping or database replication).

 

Content Deployment

  • All execution now in Timer Jobs.
  • Performance (and memory usage) improved.
  • Export routine now creates database snapshot to improve data integrity!

 

Sandboxed Solutions

  • Ability to upload WSPs directly into the content database to execute in minimal permissions using "virtual files" (no impact on the file system!)
  • Resource throttling, code performance checking and "bad routine" blocking
  • Provides new best practice for code development and deployment!

 

Search

  • New FAST search with thumbnail views (and navigation!) for office documents
  • Improved relevancy and non-query searching
  • 2 new search products (FAST based)
  • New refinement panel for advanced sorting and filtering "on the fly"
  • Multi-lingual support with over 80 languages built-in.

 

Social Networking

  • New My Sites structure
  • Activity Feeds to provide updates on user activity with an extensible architecture!
  • "Social Feedback" functions akin to Delicious and Digg allowing tagging of any URL based content, and subsequent discussions around items that have been "tagged".
  • Ratings mechanism distributed throughout the product.

 

I’m sure there are many other things, so please let me know if there’s anything else you think should "make the grade" and I’ll see if I can add it in 🙂

Social Feedback and Activity in SharePoint 2010 – Ratings, Tags and Notes

The social functionality in SharePoint 2010 has been massively improved from the previous versions of SharePoint, and one of the areas is around the concept of Social Feedback.
 
Question: How many times have you found a useful link somewhere on the internet, but had no way to usefully record that and get feedback from your colleagues?
 
Well, SharePoint 2010 social feedback can help with this: you can now "tag" any source on the internet (or intranet) which has a URL. This is stored in your "tags" section on your My Site, and also appears in your "Activity Feed" (which is one of the new areas in the SharePoint 2010 My Site).
 
Other users can also post "notes" relating to your tag, which effectively creates a discussion board around the "tagging" activity, allowing conversations around something that has been tagged.
 
Now, one of the key points is Security Trimming. Let’s take this example: what happens if you Tag a document that someone else doesn’t have access to?
 
The good news is that social tagging uses the Search Index to provide security trimming on content that is stored in SharePoint.
 
This provides the capability for senior managers to tag confidential documents (and hold conversations about that using notes) but those tags (and notes) are not visible to anyone who doesn’t have read-access to the document!
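
The trimming rule itself is simple to picture. Here is a hedged Python sketch: `readers_by_url` is a made-up stand-in for the access information held in the Search Index, and treating unknown URLs as unrestricted is my assumption for content that lives outside SharePoint.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class Tag:
    url: str
    tagged_by: str
    keyword: str


def visible_tags(tags: List[Tag], viewer: str,
                 readers_by_url: Dict[str, Set[str]]) -> List[Tag]:
    """Keep only tags on URLs that the viewer is allowed to read.

    readers_by_url stands in for the ACLs held in the search index; a URL that
    is missing from it is treated as unrestricted (e.g. an internet page).
    """
    result = []
    for tag in tags:
        readers = readers_by_url.get(tag.url)
        if readers is None or viewer in readers:
            result.append(tag)
    return result
```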
 
On top of this is included a Ratings feature, where you can rate content within SharePoint lists (finally, the death of third party "rate my content" web parts).
 
This means that SharePoint 2010 now has social feedback functionality similar to products like Digg or Delicious, in that you can tag and rate content, and other people can interact with that "tag" creating a discussion.
 
Architecture
All of the Social Feedback information in SharePoint 2010 is stored in a separate "Social Database". This sits alongside the Profile Database.
 
There are then "Gatherers" (Timer Jobs) which collect all of the changes to both the Social Database and the Profile Database and store them in another database for Activity Feeds (the Activity Feed Database), with foreign key pointers back to the Profile Database (so you know whose activity it is).
 
The performance is impressive, aiming for 2000 requests per second, and in terms of storage they are looking to support over 600,000,000 rows of data! They claim that this is sufficient for activity (including social feedback) for 400,000 users over 5 years!
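
A quick sanity check of those storage figures, assuming the rows are spread evenly across users and years (which real usage will not be):

```python
TOTAL_ROWS = 600_000_000
USERS = 400_000
YEARS = 5

rows_per_user = TOTAL_ROWS // USERS               # 1500 rows per user in total
rows_per_user_per_year = rows_per_user // YEARS   # 300 rows per user per year
rows_per_user_per_day = rows_per_user_per_year / 365  # a little under 1 per day
```

So the quoted capacity works out at slightly less than one activity row per user per day.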
 
Extensibility
You can also hook into this process yourself. You can build your own "Gatherer" jobs to collect information from any data source that you like.
 
A good example is a CRM database, so that you can show activity in CRM in the My Site Activity Feed, showing when people schedule meetings or achieve sales activities.
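
The shape of such a gatherer can be sketched in a few lines of Python. This is not the SharePoint API: the `ActivityEvent` record, the CRM row keys (`account`, `type`, `text`, `when`) and the function name are all hypothetical, chosen only to show the pattern of "collect what changed since the last run and point it back at a profile".

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Iterable, List


@dataclass
class ActivityEvent:
    profile_id: int        # pointer back to the owner's profile record
    activity_type: str
    summary: str
    occurred: datetime


def gather_crm_activity(crm_rows: Iterable[dict],
                        profile_ids: Dict[str, int],
                        since: datetime) -> List[ActivityEvent]:
    """Turn CRM rows changed after 'since' into feed events, oldest first."""
    events = []
    for row in crm_rows:
        profile_id = profile_ids.get(row["account"])
        if profile_id is None or row["when"] <= since:
            continue  # unknown user, or already collected on an earlier run
        events.append(ActivityEvent(profile_id, row["type"], row["text"], row["when"]))
    return sorted(events, key=lambda e: e.occurred)
```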
 
 
All in all the Social Feedback and Activity in SharePoint 2010 is shaping up very nicely. The performance is something that they are still working on, so don’t expect amazing results in the Beta version, but Microsoft are already using this for all of their employees so the dogfooding will make sure that this is given all the attention that it needs!

Topology Changes for SharePoint 2010 Logical Architecture

The SharePoint 2010 topology has been massively updated, allowing for greater flexibility and scalability than ever before.

 

The "Shared Service Provider" is dead, it doesn’t exist in SharePoint 2010 and instead is replaced with new "Shared Service Applications". This allows core services to have their own security settings, run in their own applications and on their own databases.

 

There is even support for "cross farm" Service Applications (such as Search, User Profiles and the Managed Metadata Service) to allow distributed farm architecture like never before. Now in SharePoint 2010 you can scale up into multiple farm environments, allowing you to take advantage of more geo-distribution flexibility, and greater performance and availability from having dedicated farm hardware for important applications.

 

For the larger enterprise environments you have the benefit that different farms provide the opportunity to service different SLA requirements, and the Many – Many relationship for Web Applications to Shared Service Applications means that core enterprise level services can be shared globally, but smaller core specific services can be hosted multiple times, closer to the client environments, to service  those farms that need them.
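
When planning this on paper it can help to write the Many – Many relationship down as data and check it. A small Python sketch, with invented farm, web application and service names:

```python
from typing import Dict, List, Set, Tuple

Connection = Tuple[str, str]  # (hosting farm, service application)


def missing_connections(hosted: Dict[str, Set[str]],
                        consumed: Dict[str, List[Connection]]
                        ) -> List[Tuple[str, Connection]]:
    """List every (web application, connection) whose service is not hosted."""
    return [(web_app, conn)
            for web_app, conns in consumed.items()
            for conn in conns
            if conn[1] not in hosted.get(conn[0], set())]


def consumers_of(service: str,
                 consumed: Dict[str, List[Connection]]) -> List[str]:
    """Web applications that use the service, whichever farm hosts it."""
    return sorted(web_app for web_app, conns in consumed.items()
                  if any(name == service for _, name in conns))
```

One web application can list several connections, and one service can appear in several lists, which is all "Many – Many" really means here.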

 

If you need greater security boundaries and better utilisation of resources you can spin up department specific farms for business critical organisational boundaries (such as HR and Finance) each with their own independent services or shared services (such as an HR specific BCS, or Finance and HR sharing their own  set of Managed Metadata for payroll and accounting data, a service that is not provided to the more generalised collaboration and publishing environments).

 

All of this comes together with other administrative changes (such as the SQL failover awareness and Managed Accounts) to make SharePoint 2010 a truly industry leading platform for web applications and technology. I cannot think of any other product on the market that offers this level of flexibility across so many different technology streams.

How to handle document retention and expiry in MOSS 2007 – Disposition Workflow and Expiration Policy

This is rapidly becoming a hot topic in Records Management with SharePoint. As organisations put increasing amounts of information into their SharePoint environments, serious thought needs to be put into how that information should expire and (more importantly) what happens when it expires.

SharePoint has several solutions to these issues which, if configured correctly, can help pave the way to successful document expiration management.

The Expiration Management Policy
Information Management Policies are a new framework introduced in MOSS 2007, and allow specific "policies" to be applied system wide using Content Types, or to individual document libraries and lists.

One of the most popular policies in information management is the Audit Policy (which is vastly superior to version history, as it can track who has viewed, downloaded or deleted a document, as well as who has made changes), but another equally important policy is the Expiration Policy.

The Expiration Policy effectively allows you to specify the retention period for content. This is generally calculated from the created or modified date, although you can specify any formula (i.e. calculated fields) based on any date/time column value.
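
The calculation behind a retention period is just "date column plus period". A minimal Python sketch, assuming a retention period in whole years (the function names are mine, and the leap-day fallback is a choice I made for the example, not documented SharePoint behaviour):

```python
import calendar
from datetime import date


def add_years(start: date, years: int) -> date:
    """Same day N years later; 29 Feb falls back to 28 Feb in non-leap years."""
    year = start.year + years
    day = min(start.day, calendar.monthrange(year, start.month)[1])
    return start.replace(year=year, day=day)


def expiry_date(base: date, retention_years: int) -> date:
    """Expiry is the chosen date column (created, modified, ...) plus the period."""
    return add_years(base, retention_years)


def has_expired(base: date, retention_years: int, today: date) -> bool:
    return today >= expiry_date(base, retention_years)
```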

When a document "expires" you can then select from a number of actions: delete the item, perform another custom action (extensible through development) or start a workflow, and it is the latter that we will focus on here.

As mentioned, you can develop additional "custom actions" for the expiry Information Policy, which you can deploy as a Feature (there is a good article here and also a forum post discussing the options for this).

The Disposition Approval Workflow
The disposition approval workflow is designed specifically for document expiry, and has been designed with a very simple user interface.

When started the workflow creates a task, which is linked to the document. The task presents the user with the options of deleting the item, or keeping it, and the ability to add some comments. Completing the task will perform the appropriate actions on the server.

This basically gives you "out of the box" capability to have items which expire and then get (optionally) deleted upon expiry.
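
Stripped of the workflow plumbing, completing the task amounts to the following. This Python sketch is only a model of the behaviour described above; the `DispositionTask` class and the dictionary standing in for a document library are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class DispositionTask:
    document: str
    decision: str = ""      # "delete" or "keep", set by the reviewer
    comments: str = ""


def complete_task(task: DispositionTask, library: Dict[str, bytes]) -> str:
    """Apply the reviewer's decision and return a one-line summary."""
    if task.decision == "delete":
        library.pop(task.document, None)
        outcome = "deleted"
    elif task.decision == "keep":
        outcome = "kept"
    else:
        raise ValueError("decision must be 'delete' or 'keep'")
    return f"{task.document}: {outcome} ({task.comments or 'no comments'})"
```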

..

Now, I put my developer hat on and delved behind the scenes.

The Disposition Approval Workflow consists of a custom Task Content Type which is used for the task, and an InfoPath form which is used for the task edit form (via Form Services).

This gives us 2 options for replacing / extending the Workflow.

1) Create our own custom content type for the tasks. This allows us to attach event handlers and perform any custom operations when the tasks are created / edited / completed.

2) Create a replacement InfoPath form and use that for the task completion, perhaps with additional options (such as "archive" ?).

Either way, the out of the box options are quite extensive, and the Workflow / Content Type structure gives enough extensibility to provide almost any functionality for document retention.

The Key to successful SharePoint Projects

I got 3 words for you …

Plan
Plan
then Plan some more …

Ok .. so this isn’t specific for SharePoint, all Software development projects are reliant on planning for successful execution (how can you know if you have delivered, if you don’t know exactly what you are supposed to be delivering?) but SharePoint has this point in spades.

The first reason is Scalability. SharePoint scales extremely well (up to 40TB in Lab Conditions according to the Microsoft SharePoint Team Blog). In fact there is an excellent Whitepaper recently available on this topic: SharePoint Server 2007 Scalability and Performance whitepaper.
However … this level of Scalability can only come if it has been planned for. You cannot simply install SharePoint "OotB" (Out of the Box) and expect it all to work.

The main planning effort for Scalability goes into "Site Collections", each of which is reported to scale easily up to about 50GB (although the labs mentioned above had >100GB Site Collections). Either way, if you are looking at an enterprise level implementation you are probably going to run out of space at some point and that will cause problems.

So .. you need to factor in Site Collections to your infrastructure. I won’t spend ages talking about them, but there is a great article by Hiran Salvi on Sites vs Site Collections that is a must read on the topic.

The other main factor is Content & Metadata Maintenance (also sometimes referred to as "Metadata Governance"). If you have been working with SharePoint for a while then you are bound to come across a client who asks the following:
 "Can I add a column to all of my Document Libraries?"

Now … if you haven’t planned for Content Types then this could be a really big headache! Content Types are effectively the schemas that define all of your SharePoint storage. Each time you create a Document Library it is using the OotB "Document" Content Type.

Now .. you could go and just modify the Library itself, but then you are effectively breaking your Content Type structure and that is bad mojo! (What happens when you want to modify said column after you’ve added it to 50 libraries?? You’ll need more than one cup of coffee for that job!)

The main problem is that you cannot modify the "Document" content type either! (Well .. you could … but you will probably break something doing it!) The reason is that a lot of other places in SharePoint also use that Content Type (such as "Master Page Gallery" and "Page Libraries" to name but a few!). A second major problem is that you cannot easily modify all of the Document Libraries either! (I mean .. that’s what Content Types are for .. right?)

The recommended practice is to create a custom Content Type for all of the user-created Document Libraries. If the client wants to add a column .. no problem, you just add the column to the new "custom" content type, and Robert’s your father’s brother!
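
Why this works is easier to see as a tiny model. The Python below is an illustration of inheritance only, not the SharePoint object model; the "Company Document" type and the library names are made up.

```python
from typing import Dict, List, Optional


class ContentType:
    def __init__(self, name: str, parent: Optional["ContentType"] = None) -> None:
        self.name = name
        self.parent = parent
        self.own_columns: List[str] = []

    def columns(self) -> List[str]:
        """Inherited columns first, then the ones defined on this type."""
        inherited = self.parent.columns() if self.parent else []
        return inherited + self.own_columns


document = ContentType("Document")
document.own_columns += ["Name", "Title"]
company_document = ContentType("Company Document", parent=document)  # hypothetical

# Every user-created library points at the custom type, not at "Document".
libraries: Dict[str, ContentType] = {
    "Contracts": company_document,
    "Policies": company_document,
}

company_document.own_columns.append("Department")  # one change, made once
```

Both libraries pick up "Department" because they share the custom type, while the built-in "Document" type (and everything else that relies on it) is left untouched.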

Of course .. if you didn’t plan for that in the beginning … well .. you’ll be in for some pain! Either you pay a temp to sit on your server for days on end re-allocating all of the Content Types system wide (and you still haven’t solved the problem of new lists and libraries that keep getting created!), or you get a developer to spend some very expensive days writing code to go around and do it all automatically!

.. either way .. best to be avoided and will be far cheaper in the long run if you plan for it!

The final point on planning (and by no means the final word on the topic, just the final point in my ramblings) is the following:
Who is going to use the system? And how will they use it?

Far too many people assume that SharePoint, being a Microsoft product .. hmph .. will just work out of the box and will solve all their problems (including World Peace). The truth is .. SharePoint is an advanced and very very complex web development platform.
Yes, it provides a whole raft of management, maintenance and administration functions and offers variability and flexibility like few other web based systems before it! But it requires time, a LOT of thought and plenty of expertise to deploy correctly.
The most important part of this is the users. What business requirement is being satisfied? What is the reason for using SharePoint (or any web based system?), what are the problems and how has this implementation been designed to solve those problems?

Last but not least .. never never never neglect the end users, especially the Content Managers. I have seen 2 things commonly happen which cause projects to fail:
1) "We’ll just give them all Team Sites"…
This is a classic. You create 500 Team Sites … one for each department / user group / business stream / office / project.
There is one major problem. they are ALL EMPTY!
99% of users will load up their nice bright shiny portal … and look at these nice big expensive blank pages. Then log off and probably never return.
If you don’t give them content and relevant design then you won’t get any buy in!

2) "We’ll let the Site Owners do what they want …"
This is probably worse than point (1). At least there you can rebuild and you don’t have to worry about the current system (because it’s pretty much empty .. right?).
Now don’t get me wrong … this technically CAN work .. but you need Effective Training to make sure that your Site Owners understand the following ideas:
* How to manage content
* How to create a hierarchical structure that works
* How to manage security permissions
* What is the vision for the system as a whole?
* What do we want people to use the system for?

Without a unified vision for the system you end up with an implementation that is haphazardly put together, where each site has its own layout, classifications and web parts. Trying to centrally manage a system like that will be a nightmare .. for both the project leaders and the systems administrators!

Well .. that’s my ramblings for now, hope some of these nuggets have sunk in.

As always … comments welcome!

– Martin Hatch