Category Archives: .NET

ASP.NET MVC applications: from classic example to modern-day architecture

It has been more than ten years since Microsoft released ASP.NET MVC as an alternative to ASP.NET Webforms[i]. Webforms was originally intended to ease the transition from desktop Windows applications to web applications for developers, but with its viewstate and events it was often seen as a forced, non-web way of building web applications that hid the true nature of web development.

In 2007, after years of community debate, ASP.NET MVC was announced as Microsoft’s new approach to web application development[ii]. The MVC design pattern from which it derives its name, however, is one of the oldest software design patterns around, dating back to Smalltalk in the late 1970s and early 1980s.

The Model-View-Controller (MVC) pattern

The MVC pattern can be used where data needs to be presented in some form to a user or external system. This is usually done by rendering some marked-up presentation on a screen, although other forms (like the JSON output of a Web API service) can be seen as presentation as well. The MVC pattern emerged as a result of the object-oriented principle of separation of concerns:

  • The Model, which contains the data to be presented and possibly notifies the view of state changes.
  • The View, which holds the presentation logic for the data, possibly containing markup and conditional scripts, and provides ways to send user input back to the controller.
  • The Controller, which sends the view to the rendering device and handles (user) communication from the view to the model and the rest of the application.

Diagrams vary a bit but usually the MVC pattern is something like this:

Although the pattern as shown in the diagram is still valid, the world has changed dramatically in the decade that ASP.NET MVC has existed. Applications have become much more complex and distributed, with big data, microservices, links to various external systems, security and privacy demands, and mobile and cloud-based platforms. Modern business and project delivery methods like Scrum and DevOps demand flexible and highly testable solutions. As a result, the three parts of the pattern tend to become too complex, and the SOLID[iii] principles demand more separation of logic.

Classic ASP.NET MVC implementation

In this article I assume the reader is familiar with at least the basics of ASP.NET MVC. When creating an ASP.NET MVC application in a .NET development environment, the classic MVC structure is quite apparent in the folder structure of the (Visual Studio) project:


Project
  |- Controllers
  |- Models
  |- Views

There will be some more folders for various things, but the MVC structure is clearly visible. The Views folder contains the Razor views (.cshtml files) in folders named according to convention. The Models folder is meant to contain ViewModels, but in simple (CRUD[iv]) applications and examples the model classes in here quite often reflect their data source directly.

In the traditional way the controller classes often contained one or more GET methods, possibly with a parameter to get a specific model instance to return to a view, and some POST methods through which a model instance could be created or updated. Quite often the controller connected with the datasource directly, creating some connection context either within the methods or on initialization of the controller class. So generally a controller could have a code structure like this:


public class ProductController : Controller
{
    SomeDbContext db=new SomeDbContext();   // The database context
      
    public ProductController()
    {
        // Code to initialize further the database context if needed
    } 

    public ActionResult Index()
    {
        // …
        // Use the database context to get a list of products and create a list of “ProductModel” items named products to return to the “Index.cshtml” view
        // …

        return View(products);
    }

    public ActionResult Products(int id)
    {
        // …
        // If id is 0 redirect to Index, otherwise get the product with the specific Id from the database context and return to the “Product.cshtml” view
        // …

        return View(product);
    }

    [HttpPost]
    public ActionResult Create(Product product)
    {
        //…
        // Code to create a new product in the database
        // …
        return RedirectToAction("Index");
    }

    [HttpPost]
    public ActionResult Update(Product product)
    {
        //…
        // Code to update a product in the database
        // …
        return RedirectToAction("Index");
    }
}

This is pretty much a controller that can handle giving an overview of products and do CRUD operations on products. If the logic between retrieving data and sending it to the view gets more complex it can be delegated to private functions on the controller or separate business logic classes.

Model classes could be generated, and the data accessed from a database, using a framework like LINQ to SQL. In the traditional object-oriented way, classes should encapsulate data and functionality, so any functions could be added directly on the class or by using the “partial” class construct Microsoft introduced to deal with extending auto-generated classes [v].

In many tutorials and older examples this is the general setup shown for an ASP.NET MVC application, and for simple CRUD applications it can still be fine.

ASP.NET MVC in modern software

Nowadays software tends to be much more complex than the traditional 3-tier approach of web applications which mainly consists of the application’s data layer (often a relational database), business logic layer and presentation layer. The above approach has several disadvantages:

  • The use of the database context as a private field in the controller causes a tight coupling between the two, making (unit) testing more complex and time consuming.
  • When complexity increases, the controller classes lose focus and increasingly violate the SOLID and DRY principles of good object-oriented design. The purpose of a controller class should be mediating between view, (user) interaction and the underlying application.
  • Using data models in the view can give unnecessary or even unwanted access to data fields and/or functions.
  • Models can get complex with added functions and dependencies. They are not focused on their primary role, which is holding and transferring data within and between systems.
  • With the increase of data volumes and distribution of data in many locations, it is desirable to keep data requests and data transfers (and therefore data models) as compact as possible since sending redundant data over networks and the internet can decrease performance dramatically.
  • Quite often modern software systems need to incorporate and communicate with third party components and services, for example federated authentication systems and payments providers. Usually developers have no control on how and in what format these third party systems deliver their data, causing a need for transformations and extra checks in the application.
  • Business demands and DevOps practices require fast and frequent updates of software parts. Therefore, the fewer dependencies between components and classes, the better.

Removing the controller dependencies on datasource contexts

If we want to create automated (unit) tests, the first problem to overcome is the tight coupling between the controller and the database context. This can be done by either using reflection or some other bypass to replace the database context with a mock object on test initialization, or by using a real database for testing.

Especially the second option causes a lot of overhead for initialization before and cleanup after each test. On top of that the communication with the database will severely slow down unit tests which can be unacceptable in a DevOps environment. The first option will not always be possible since the data source may already require configuration or a valid connection when the controller instance is created.

Data for a controller can come from multiple sources and may need structuring, filtering or transforming. There is a tendency towards using web service and REST protocols for communication with data sources because of distribution and scalability. A more general term “repository” has emerged to indicate the various forms of data storage and services.

In MVC applications, we create a “Repositories” folder and in it a repository class for each(!) data source. In our example we can create a class “SqlDbRepository”, into which we move the SomeDbContext and any logic involving its initialization and data manipulation.

Since we are implementing the repository classes ourselves, we have full control over how and where the data sources are initialized and accessed. We have also cleaned up our controllers by moving code related to context initialization and data handling to the repository classes. By creating interfaces for the repository classes and using those in the controller, we make the controller independent of a data source context or client implementation, and we can create mock implementations in unit tests without much effort.
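As a minimal sketch of this idea, the repository interface and a hand-written fake for unit tests could look like the code below. The names ProductData, IProductRepository and InMemoryProductRepository are assumptions made for illustration; they do not come from a real project. The real SqlDbRepository would implement the same interface and wrap SomeDbContext.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical data model returned by the repository.
public class ProductData
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Controllers (or service classes) depend on this interface only.
public interface IProductRepository
{
    IList<ProductData> GetAll();
    ProductData GetById(int id);   // returns null when no product matches
}

// Hand-written fake for unit tests: no database or connection needed.
// A production SqlDbRepository would implement the same interface
// and keep SomeDbContext as a private detail.
public class InMemoryProductRepository : IProductRepository
{
    private readonly List<ProductData> _items;

    public InMemoryProductRepository(IEnumerable<ProductData> items)
    {
        _items = items.ToList();
    }

    public IList<ProductData> GetAll()
    {
        return _items.ToList();   // return a copy so callers cannot change our state
    }

    public ProductData GetById(int id)
    {
        return _items.FirstOrDefault(p => p.Id == id);
    }
}
```

A test can then construct the fake with a few known items and hand it to whatever class expects an IProductRepository.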

Using the right models in the right place

More complex software means quite often a view needs to combine data from various sources. Therefore models used by views may differ greatly from the models used for retrieving and transferring data. For example think of an invoice view which may need to combine data coming from a CRM system, financial system and a postal code checking system. On top of that we may have little control on format and content of the data delivered so we may need to perform transformations before using.

So the model classes in our MVC Models folder should be viewmodels tailored to the views that will use them, and the data for them should be transferred from datamodels or business logic models specific for incoming data transfers or processes. Quite often I see code for these transfers spread across a project in controllers or on model classes themselves, usually looking something like this:


var personViewModel = new PersonViewModel
{
    Firstname = personData.Firstname,
    Lastname = personData.Prefix + " " + personData.Surname,
    BirthDate = personData.BirthDate,
    Address = personData.Street + " " + personData.HouseNumber,
    …
};

From experience I know programming the logic for this can be tedious and time consuming.

It makes sense to delegate the operations for this data mapping to separate mapper classes, and put these in a separate “Mappers” folder in our project structure. Tools like Automapper (https://automapper.org/) can be a great help to reduce the code that needs to be written for this. However in high performance applications the hardcoded approach may still be favourable since these mappers can come with a small performance hit.

It is recommended to create a mapper class per target type (viewmodel). This class gets (usually static) operations taking one or more source objects. For naming convention give the class a name “MapToTargetType”, and implement operations as From(SourceType source).
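Following this naming convention, a hand-written mapper for the person example might look like the sketch below. The property types, and the choice to skip empty name parts when joining, are assumptions for illustration only.

```csharp
using System;
using System.Linq;

// Hypothetical source model, e.g. as delivered by a CRM system.
public class PersonData
{
    public string Firstname { get; set; }
    public string Prefix { get; set; }
    public string Surname { get; set; }
    public DateTime BirthDate { get; set; }
    public string Street { get; set; }
    public string HouseNumber { get; set; }
}

// Hypothetical viewmodel tailored to a view.
public class PersonViewModel
{
    public string Firstname { get; set; }
    public string Lastname { get; set; }
    public DateTime BirthDate { get; set; }
    public string Address { get; set; }
}

// One mapper class per target type, with a static From() per source type.
public static class MapToPersonViewModel
{
    public static PersonViewModel From(PersonData source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));

        return new PersonViewModel
        {
            Firstname = source.Firstname,
            Lastname = JoinNonEmpty(source.Prefix, source.Surname),
            BirthDate = source.BirthDate,
            Address = JoinNonEmpty(source.Street, source.HouseNumber)
        };
    }

    // Joins the parts with a single space, skipping null or blank parts.
    private static string JoinNonEmpty(params string[] parts)
    {
        return string.Join(" ", parts.Where(p => !string.IsNullOrWhiteSpace(p)));
    }
}
```

A controller or service class then only needs a single call such as MapToPersonViewModel.From(personData), and the mapping logic lives in one place.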

A small side note: although it would make sense to keep different types of models in different folders (e.g. “ViewModels”, “DataModels”), I haven’t seen this much in real projects yet. The “Models” folder that is generated by default somehow tends to end up as the place for all model classes in a project.

Constants and enumerations

Although constants and enumerations can be defined anywhere, the danger of having them in random places is that developers can overlook them when they need them and create duplicate definitions in a project. I’ve seen quite a few projects where constants and enumerations were defined in several places in the code base. Quite often the duplicates differ slightly from each other, introducing bugs in the system when code is altered.

Therefore it is not a bad practice to keep these in a separate folder named “Constants” in the root structure. Then when code is altered or added and a developer needs a constant or enumeration, it is quite easy to look if it has already been defined.

Orchestrating the parts

So we have data coming in from multiple sources through our repositories, and we map that data to our viewmodels through mapper classes. Maybe we need to do some checking, validation or other extra work. If we need to combine data from different data sources in our viewmodel, we cannot do this in a repository class, since each repository class should depend on one data source only.

This can still result in quite complex code or unwanted dependencies in our controllers. For this I tend to create specialized service classes, in which I put this logic. Although there is no naming convention for them, I usually call them “ServiceClass” preceded by the model or controller type name (e.g. “ProductServiceClass”). References to (interfaces for) repositories are moved to these service classes, and a controller just gets a reference to a service class and calls a method on it to retrieve the viewmodel. All logic that is needed to create a controller’s viewmodel and that transcends the scope of repositories, mappers or other classes is placed in the service class.

If the logic in a service class gets complex, design principles (SOLID, DRY) can require a more elaborate structure. In that case the service class may act as a façade that uses other business logic classes in the system.

Using service classes also helps reduce code duplication in case multiple controllers use (part of) the same data and repositories. Complex logic that applies to one model class is moved from the model class to the service class too, so we get clean models with little overhead or clutter.
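To make this more concrete, here is a small sketch of a service class that combines two data sources into one viewmodel. All type names (IPriceRepository, ProductViewModel, the fake repositories) are hypothetical. The repositories are passed in through the constructor here so that the sketch runs without a database.

```csharp
using System.Collections.Generic;
using System.Linq;

public class ProductData { public int Id { get; set; } public string Name { get; set; } }

public class ProductViewModel
{
    public int Id { get; set; }
    public string Name { get; set; }
    public decimal Price { get; set; }
}

// Each repository wraps exactly one data source.
public interface IProductRepository { IList<ProductData> GetAll(); }
public interface IPriceRepository { decimal GetPrice(int productId); }

public interface IProductServiceClass { List<ProductViewModel> GetList(); }

// The service class orchestrates repositories and mapping for the controller.
public class ProductServiceClass : IProductServiceClass
{
    private readonly IProductRepository _products;
    private readonly IPriceRepository _prices;

    public ProductServiceClass(IProductRepository products, IPriceRepository prices)
    {
        _products = products;
        _prices = prices;
    }

    public List<ProductViewModel> GetList()
    {
        // Combine both sources into viewmodels, sorted by name for display.
        return _products.GetAll()
            .Select(p => new ProductViewModel
            {
                Id = p.Id,
                Name = p.Name,
                Price = _prices.GetPrice(p.Id)
            })
            .OrderBy(vm => vm.Name)
            .ToList();
    }
}

// Stand-ins for real repositories, for illustration and tests only.
public class FakeProductRepository : IProductRepository
{
    public IList<ProductData> GetAll()
    {
        return new List<ProductData>
        {
            new ProductData { Id = 2, Name = "Pen" },
            new ProductData { Id = 1, Name = "Book" }
        };
    }
}

public class FakePriceRepository : IPriceRepository
{
    public decimal GetPrice(int productId)
    {
        return productId == 1 ? 12.50m : 1.20m;
    }
}
```

The controller never sees the two repositories; it only asks the service class for a list of viewmodels.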

The new controller code

By now we should have a controller with little code in each operation, like below:


public class ProductController : Controller
{
    private readonly IProductServiceClass service = new ProductServiceClass();

    public ActionResult Index()
    {
        List<Product> products = service.GetList();
        return View(products);
    }

    public ActionResult Products(int id)
    {
        // An int can never be null; 0 means no product was specified
        if (id == 0)
        {
            return RedirectToAction("Index");
        }
        Product product = service.Get(id);
        return View(product);
    }

    [HttpPost]
    public ActionResult Create(Product product)
    {
        bool success = service.Create(product);
        // Here can go logic to deal with failed creation
        return RedirectToAction("Index");
    }

    [HttpPost]
    public ActionResult Update(Product product)
    {
        bool success = service.Update(product);
        // Here can go logic to deal with a failed update
        return RedirectToAction("Index");
    }
}

As you can see, we now have a pretty clean controller which only contains code to deal with the interaction between view, user and the underlying system. Any controller will look like this, making it fast and easy to create new views and controllers. By using generics and deriving the service classes from generic interfaces, it is possible to make a base controller class containing the above logic and to create specific controllers by just deriving from this base class with the specific types for viewmodel and service.

Using dependency injection to remove hardcoded dependencies

Although we made our code quite SOLID and DRY with the above, we still have a hardcoded dependency on a service class, and through that on the underlying repositories and other classes. Controllers in ASP.NET MVC are instantiated by the MVC framework on each request, and therefore we need a way to insert a dependency at runtime.

Over the last several years we have seen the rise of the so-called Inversion of Control (IoC) and Dependency Injection (DI) patterns in ASP.NET MVC applications. .NET Core has native libraries for this in the Microsoft.Extensions.DependencyInjection namespace. For standard ASP.NET MVC there are several third-party frameworks that implement these patterns, like Ninject, Autofac and Unity.

Basically these all work the same. By adding one of these frameworks to our ASP.NET MVC application, we get the possibility to pass a dependency through a controller’s constructor instead of making a hardcoded reference in the controller. So instead of:


public class ProductController : Controller
{
    IProductServiceClass service=new ProductServiceClass (); 

    …   
}

We can do:


public class ProductController : Controller
{
    IProductServiceClass service;

    public ProductController(IProductServiceClass injectedService)
    {
        service = injectedService;
    }

    …   
}

Since the controller no longer creates the service class instance itself, we can do the same again in our service class: instead of hard-coding the creation of repository instances, we can pass them in through the constructor of our service class by means of interface parameters. The DI framework takes care of passing on the concrete implementations and calling the right constructor.

Of course the DI framework needs to know in advance which implementation to link to which interface. This is done by registering them on application initialization and keeping them in a context called an IoC container. Generally the application’s startup code calls a configuration method on a class like RegisterDependencies, which contains code like:


services.AddTransient<IOperationTransient, Operation>();

By using dependency injection we have removed the hardcoded dependencies of classes and components on each other, making it much easier to swap out or change individual components of a software system. Quite often it is also possible to control the scope and lifetime of instances through the IoC container (e.g. with operations like AddScoped and AddSingleton). The benefits for (unit) testing are clear as well: we can easily create mock or alternative implementations in a testing environment by implementing the parameter interfaces and passing them to the constructors.
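As an illustration of how registration and resolution fit together, here is a small sketch using the container from the Microsoft.Extensions.DependencyInjection package (which has to be referenced as a NuGet package). The classes and the chosen lifetimes are assumptions for the example; in a real MVC application the framework resolves the controller for you, so you would not call the container directly as Demo does here.

```csharp
using System;
using Microsoft.Extensions.DependencyInjection;

public interface IProductRepository { string Describe(); }

// Hypothetical repository; a real one would wrap SomeDbContext.
public class SqlDbRepository : IProductRepository
{
    public string Describe() { return "sql"; }
}

public interface IProductServiceClass { string Describe(); }

public class ProductServiceClass : IProductServiceClass
{
    private readonly IProductRepository _repository;

    // The container supplies the repository; no "new" in this class.
    public ProductServiceClass(IProductRepository repository)
    {
        _repository = repository;
    }

    public string Describe() { return "service using " + _repository.Describe(); }
}

public static class RegisterDependencies
{
    // Links each interface to the implementation the container should create.
    public static IServiceCollection Configure(IServiceCollection services)
    {
        services.AddSingleton<IProductRepository, SqlDbRepository>();       // one shared instance
        services.AddTransient<IProductServiceClass, ProductServiceClass>();  // new instance per resolve
        return services;
    }
}

public static class Demo
{
    public static string Run()
    {
        var services = RegisterDependencies.Configure(new ServiceCollection());
        using (var provider = services.BuildServiceProvider())
        {
            // The container builds ProductServiceClass and injects SqlDbRepository.
            var service = provider.GetRequiredService<IProductServiceClass>();
            return service.Describe();
        }
    }
}
```

Swapping SqlDbRepository for another implementation then only requires changing the registration line, not the service class or the controller.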

The new ASP.NET MVC application structure

So for our ASP.NET MVC application, after the above our folder structure could look like this:


Project
  |- Constants
  |- Controllers
  |- Mappers
  |- Models
       |- ViewModels
       |- DataModels
  |- Repositories
  |- ServiceClasses
  |- Views

We will have added some DI framework and a class like RegisterDependencies in our project root.

Of course the above is not a single perfect solution for every project, but in my opinion a modern ASP.NET MVC application is much more than just some models, controllers and views. Too often I see projects with code all over the place, quite often with random “helper” classes in cases where developers ran into issues with too complex code or redundancy. Hopefully this article helps.

 

Sources:
[i] https://www.dotnettricks.com/learn/mvc/a-brief-history-of-aspnet-mvc-framework
[ii] https://weblogs.asp.net/scottgu/asp-net-mvc-framework
[iii] https://en.wikipedia.org/wiki/SOLID
[iv] CRUD: Create, Retrieve, Update, Delete. A term derived from the four standard data manipulation operations on records in a database.
[v] https://docs.microsoft.com/en-us/dotnet/csharp/programming-guide/classes-and-structs/partial-classes-and-methods

Sitecore content search and LINQ (part 2)

This is part 2 of my post about Sitecore content search and LINQ. For the first part see https://sionict.wordpress.com/2015/09/19/sitecore-content-search-and-linq-part-1/.

In the first part we set up our custom index and created a piece of code to use it. Now let’s say that we don’t only have products but also services to list on our site. We created a Sitecore template for them and want our Service items to be in the index as well. So we added our Service template ID to the index as we did before with the product template ID:

<serviceTemplateId>{guid}</serviceTemplateId>

Don’t forget to regenerate your index and check with Luke that the contents are there after publishing your first Service items! Sometimes you may have to restart IIS and regenerate again to get the index working. If it keeps failing, clear the whole custom index folder in the Sitecore data folder and try to regenerate again.

We create a class for our service entries in our code:

using Sitecore.ContentSearch.SearchTypes;
namespace ScPredicateBuilderApp.DataObjects
{
   public class ServiceSearchResult:SearchResultItem
   {
      public string ServiceName { get; set; }
      public string Description { get; set; }
      public double Rate { get; set; }
   }
}

and some code to retrieve them from the index, similar to what we did for products but with the above class as the type parameter in GetQueryable(). Of course C# generics can be very convenient to prevent a lot of duplicate code, so you may want to change the SearchProducts() method of the previous part into something more generic:

/// <summary>
/// Get the index contents using search
/// </summary>
public List<T> SearchIndexContent<T>() where T : SearchResultItem
{
   ISearchIndex myIndex = ContentSearchManager.GetIndex("sitecore_myindex");
   using (var context = myIndex.CreateSearchContext())
   {
      var query = context.GetQueryable<T>();
      var result = query.ToList();
      return result;
   }
}

Now when calling this method pass either ProductSearchResult or ServiceSearchResult as type parameter to get a list of the corresponding type.

However… as said before Lucene entries are generic (“documents”) and Sitecore has no way of knowing what entry belongs to what type (class) you specified based on the index alone. It will try to map ALL entries to the type you passed, leaving empty properties for fields it cannot map. So the first thing we have to do when using contentsearch is add a filter on template to the Lucene query.

Adding a template filter

Where and how we pass the template ID to filter on is a matter of conventions in your working environment or personal preference. You can put a base class between SearchResultItem and your content search classes where you retrieve the template ID (and to define common properties), or pass it as a parameter when calling the above method. In this case I do the latter. So we add a parameter to the method:

public List<T> SearchIndexContent<T>(string templateId) where T : SearchResultItem

In this case I defined some configuration for my project that holds the Sitecore IDs as strings, and I pass them as such in the parameter. Now we add a line in our using block and extend our query line with a filter on the template. In other words, replace the “var query = ....” line with:

var tId=new ID(templateId);
var query = context.GetQueryable<T>().Filter(t=>t.TemplateId==tId);

Filter() is part of the LINQ interface Sitecore has implemented for contentsearch. There is also a Where() implementation. The difference is that Where returns a result taking Lucene’s scoring system into account whereas Filter just returns the result set. Since we don’t do anything with the scoring here we use Filter.

As with any LINQ implementation on a datasource you have to assign external parameters to a local variable before entering them into a LINQ expression to avoid so-called “modified closure” issues. These issues can cause nasty bugs that won’t give you a warning or error but can return incorrect results. Also, if the parameter in the Filter() expression is the result of a function or calculation you have to assign it first to a local variable or it will cause runtime exceptions. For example combining the above into something like:

var query = context.GetQueryable<T>().Filter(t=>t.TemplateId== new ID(templateId));

will NOT work. So always use a local variable as parameter in LINQ expressions.

You can chain LINQ methods just like any other LINQ implementation. Sitecore will take the expression and convert it to a Lucene query, returning the result as an IQueryable. Let’s say we want our products sorted on creation date, so something like:

var query = context.GetQueryable<T>().Filter(t=>t.TemplateId==tId).OrderByDescending(p=>p.CreatedDate);

will work. CreatedDate comes from a computed field that was already defined by Sitecore in our index configuration file as “__smallcreateddate”. It is one of the predefined properties on SearchResultItem.

Since our generic method takes SearchResultItem as type, we don’t have our product- or service specific properties here. However you can refer to fields directly using the Lucene field names and indexer. Let’s say we have a boolean field “Available” added to our Product template, we can do something like:

var query = context.GetQueryable<T>().Filter(t=>t.TemplateId==tId).Filter(p=>p["available"]=="1");

Note that using the Fields collection for the parameter (p=>p.Fields[“available”]) instead of the indexer uses a function get_Item() internally, causing an exception. It is one of many quirks you have to be aware of when using LINQ for contentsearch. Also, this way you’re referring to values as they are stored in Lucene, which means they’re not type-converted and thus might not give the results you expect. The above is therefore a string comparison.

Of course we now also have a problem for our ServiceSearchResult entries since they don’t have a field “available”. It results in a null value for the field and thus the above expression is always false.

Building LINQ expressions dynamically

So what if we want to have expressions depending on our item type? We could go back to making type-specific search functions, having to duplicate the template filtering expression (and probably more) in each one. Moreover, in real-world scenarios we may not know in advance what expressions may be needed (a user may or may not select a specific search option). In short, we want to build our search expression from separate parts.

In comes Sitecore’s PredicateBuilder. This class resides in the Sitecore.ContentSearch.Linq.Utilities namespace and can be used to create and manipulate (partial) LINQ expressions for a given type. You start by calling either the True() or the False() method, where you use True() for operations that need to be combined using logical “And” and False() for logical “Or”. Then you use the And() or Or() methods for the comparison expression. The current version also has a Create<T>() method, but at this point I cannot confirm it works the same as the True/And and False/Or method combinations under all circumstances.

These methods return an object of type System.Linq.Expressions.Expression. To make this all clear it is best to show an example. Say we put the expression for available products in a separate method, it will look like this:

/// <summary>
/// Return an expression for filtering on available products only
/// </summary>
/// <returns></returns>
public Expression<Func<ProductSearchResult, bool>> GetAvailableProductsExpression()
{
   var predicate = PredicateBuilder.True<ProductSearchResult>();    //True for "And"
   predicate = predicate.And(p => p.Available == true);
   return predicate;
}

Now we add a parameter of type Expression<Func<T, bool>> to our SearchIndexContent method, and pass the result of the above in when calling the search for products. For services we don’t pass in a specific expression and use a dummy expression in our SearchIndexContent method. It now looks like this:

/// <summary>
/// Get the index contents using search
/// </summary>
public List<T> SearchIndexContent<T>(string templateId, Expression<Func<T, bool>> expression = null) where T : SearchResultItem
{
   ISearchIndex myIndex = ContentSearchManager.GetIndex(Constants.IndexName);
   using (var context = myIndex.CreateSearchContext())
   {
      var tId = new ID(templateId);
      // Dummy if null
      var exp = expression ?? PredicateBuilder.False<T>().Or(p => true);
      var query = context.GetQueryable<T>().Filter(t => t.TemplateId == tId).Filter(exp);
      var result = query.ToList();
      return result;
   }
}

So to get our products we call it with:

var products = SearchIndexContent<ProductSearchResult>(Constants.ProductTemplateId, GetAvailableProductsExpression());

And to get our services:

var services = SearchIndexContent<ServiceSearchResult>(Constants.ServiceTemplateId);

(Note for this example code I defined the IDs of the templates as strings in a static Constants class.)

Predicate expressions can be combined and nested. Instead of putting a boolean expression in the And() or Or() methods directly you can put the result of another PredicateBuilder in. This way you can logically combine “And” and “Or” expressions, create large complex expressions from smaller parts and “inject” expressions based on user filter selections for example.

Using PredicateBuilder in general

As mentioned before, the resulting expression of a PredicateBuilder operation is of a .NET type and not a Sitecore type. The PredicateBuilder is not tied to Sitecore’s content search itself. As far as I know you can use it with any type that can be cast using AsQueryable(). This can be quite handy, for example to do post-search filtering using the facets from the query.GetResults() method mentioned in part 1 of this article, or to perform post-search operations on the result set that Sitecore’s LINQ parser cannot translate to a search query.
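To show the principle without needing a Sitecore installation, the sketch below uses a small hand-written stand-in, SimplePredicateBuilder, built on System.Linq.Expressions. It is not Sitecore’s implementation; it only mimics the True/And and False/Or combinations described above and applies the result to an in-memory list through AsQueryable(). The Product class and its values are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical stand-in for Sitecore's PredicateBuilder, for illustration only.
public static class SimplePredicateBuilder
{
    public static Expression<Func<T, bool>> True<T>() { return item => true; }
    public static Expression<Func<T, bool>> False<T>() { return item => false; }

    public static Expression<Func<T, bool>> And<T>(
        this Expression<Func<T, bool>> left, Expression<Func<T, bool>> right)
    {
        // Invoke the right-hand lambda with the left-hand parameter so both share it.
        var invoked = Expression.Invoke(right, left.Parameters.Cast<Expression>());
        return Expression.Lambda<Func<T, bool>>(
            Expression.AndAlso(left.Body, invoked), left.Parameters);
    }

    public static Expression<Func<T, bool>> Or<T>(
        this Expression<Func<T, bool>> left, Expression<Func<T, bool>> right)
    {
        var invoked = Expression.Invoke(right, left.Parameters.Cast<Expression>());
        return Expression.Lambda<Func<T, bool>>(
            Expression.OrElse(left.Body, invoked), left.Parameters);
    }
}

public class Product
{
    public string Name;
    public bool Available;
    public double Price;
}

public static class Demo
{
    public static List<string> Run()
    {
        var products = new List<Product>
        {
            new Product { Name = "Pen", Available = true, Price = 2.0 },
            new Product { Name = "Laptop", Available = true, Price = 900.0 },
            new Product { Name = "Gift box", Available = true, Price = 25.0 },
            new Product { Name = "Old pen", Available = false, Price = 1.0 }
        };

        // Nested "Or" part: cheap products or one specific product.
        var cheapOrGiftBox = SimplePredicateBuilder.False<Product>()
            .Or(p => p.Price < 10)
            .Or(p => p.Name == "Gift box");

        // Outer "And" part: available AND (cheap OR gift box).
        var predicate = SimplePredicateBuilder.True<Product>()
            .And(p => p.Available)
            .And(cheapOrGiftBox);

        return products.AsQueryable().Where(predicate).Select(p => p.Name).ToList();
    }
}
```

With the sample data above, only “Pen” and “Gift box” satisfy the combined predicate: the laptop is too expensive and the old pen is not available.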

There’s a lot more to Sitecore’s content search, LINQ and the PredicateBuilder than can be covered by this article. Unfortunately Sitecore’s documentation about it is far from complete, so you’ll have to look around on the web and experiment to figure out all that’s there.

Getting the demo code

I have uploaded a sample application to GitHub at https://github.com/mcrvdriel/ScPredicateBuilderDemo. It contains a simple Visual Studio 2013 solution, and a folder containing a package you can import into Sitecore to get the templates and some items. Since Sitecore libraries are proprietary software they are not included and you have to add them to the solution yourself. You need to have a working Sitecore 8 on your environment to be able to use this code, and set the demo project to publish to it.

Sitecore content search and LINQ (part 1)

With the rise of cloud services and today’s requirements for data retrieval, traditional relational and tree data models with their query structures tend to be replaced with non-relational models and search-based query mechanisms. Within Sitecore this shift is noticeable with the introduction of item buckets and the use of Sitecore contentsearch for larger amounts of data. Using a search engine based storage and search mechanism not only improves performance drastically but also improves the flexibility of data retrieval. Since the release of Sitecore 7, Sitecore wants developers to favour indexes over the database for performance reasons. See http://www.sitecore.net/learn/blogs/technical-blogs/sitecore-7-development-team/posts/2013/06/sitecore-7-poco-explained.aspx

It is well known Sitecore may run into performance issues when a lot of items are stored in the content tree and searched by querying the traditional way. But also consider a scenario that’s not uncommon today:
Let’s say we have a user that got a gift card and wants to browse our webshop to see what’s available for the gift card amount. So he or she wants to enter something that translates to a query like “show me all products with a price tag less than …” or “all products added to the shop since …”. This cuts right through all categories so we’d have to query the whole tree structure, filtering out all products that do not apply. With a complex tree structure and a large number of product items, even Sitecore’s fast query will give performance issues.

By default Sitecore uses the open source Lucene engine but this can be replaced by the SOLR engine provider that comes with Sitecore, or another custom or 3rd party provider. When converting a folder in the content tree to an item bucket, it will use an index to retrieve items when you enter a search expression in the bucket’s search box. However you don’t need to use item buckets for using contentsearch; it works just as well with items stored in a conventional tree structure.

When showing our products on a public site, the information will normally come from the Sitecore web database, which also has a Lucene index defined on it. However, when using contentsearch Sitecore recommends creating your own custom index instead of using the web index, for a number of reasons:

  • The web index contains references for (almost) all items in the web database, which decreases performance.
  • By default the index does not store whole values, so you’d still have to retrieve individual items from the database.

By creating your own custom index (or multiple indexes) you can specify that it contains only the items you need, and store the necessary data in the index so you don’t have to retrieve it from the database.

Creating a custom index

For creating a custom index you need to create two configuration files based on the web index. I’d also recommend installing Luke or a similar index viewer for troubleshooting. Luke is a Java application (.jar), so you need to have Java installed as well.
There used to be a blog showing the minimum needed to create a custom index, but that seems to be no longer online, so I’ll outline the process here:

  • Make a copy of these files in App_Config\Include in the website folder:
    • ContentSearch.Lucene.Index.Web.config ==> rename the copy to Sitecore.ContentSearch.Lucene.Index.MyIndex.config.
    • ContentSearch.Lucene.DefaultIndexConfiguration.config ==> rename the copy to Sitecore.ContentSearch.Lucene.MyIndexConfiguration.config.
  • In Sitecore.ContentSearch.Lucene.Index.MyIndex.config (= index definition):
    • Index node: rename the id from sitecore_web_index to sitecore_myindex.
    • Configuration node: set the ref attribute to “contentSearch/indexConfigurations/myIndexConfiguration”. We will create our own configuration in the other file under this XML path.
    • Note the settings like publishing strategy, database and root for the search. These can be left as-is or changed, e.g. use the sitecore/Content/Products folder as root to only include items from this folder in the content tree. Leave the strategy set to “onPublishEndAsync”, which will update the index when publishing.
  • In Sitecore.ContentSearch.Lucene.MyIndexConfiguration.config (= index configuration):
    • Instead of keeping all sections in this file you can replace a lot of them with references to the original in the Sitecore.ContentSearch.Lucene.DefaultIndexConfiguration.config file. See Sitecore documentation and references for this.
    • Add the following directly under the <sitecore> tag:
<!-- This section for database is so that the indexes get updated in any environment when an item changes -->
<databases>
  <database id="web" singleInstance="true" type="Sitecore.Data.Database, Sitecore.Kernel">
    <Engines.HistoryEngine.Storage>
      <obj type="Sitecore.Data.$(database).$(database)HistoryStorage, Sitecore.Kernel">
        <param connectionStringName="$(id)" />
        <EntryLifeTime>30.00:00:00</EntryLifeTime>
      </obj>
    </Engines.HistoryEngine.Storage>
    <Engines.HistoryEngine.SaveDotNetCallStack>false</Engines.HistoryEngine.SaveDotNetCallStack>
  </database>
</databases>
    • Rename the “defaultLuceneIndexConfiguration” XML node to “myIndexConfiguration”.
    • Remove the “Settings” section since it is already in the default configuration file we copied from.
    • “IndexAllFields” must be left set to true.
    • Remove the nodes under the “FieldNames” node, EXCEPT the “_uniqueid” one. The “_uniqueid” field is necessary for Sitecore.
    • In “FieldTypes” remove the types you don’t need in the index. For the remaining ones, change storageType to YES to have the values of these fields stored in the index. When field values are stored in the index you don’t need to retrieve them from the database. This increases the size of your index, but having to go to the database after each search would more or less nullify the performance gain you get from using search.
    • Uncomment the <include hint="list:IncludeTemplate"> node and remove the “BucketFolderTemplateId” node. This is where we will specify the templates of the items we want indexed.
    • Go through the other “field=” nodes to see if you’re OK with them (like excluding certain fields).
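
Put together, the parts of the index definition file touched by the steps above might look roughly like the fragment below. This is a sketch, not a complete file: the type names and the strategy path are written from memory of the stock web index definition and are assumptions, so keep whatever your copied file contains for those, and leave all elements not shown here as they were. The Products root is just the example folder mentioned above.

```xml
<!-- Sketch only: elements not shown here stay exactly as copied from the web index file -->
<index id="sitecore_myindex" type="Sitecore.ContentSearch.LuceneProvider.LuceneIndex, Sitecore.ContentSearch.LuceneProvider">
  <!-- Points at the renamed node in the index configuration file -->
  <configuration ref="contentSearch/indexConfigurations/myIndexConfiguration" />
  <strategies hint="list:AddStrategy">
    <!-- Assumed path; use the strategy reference your copied file already has -->
    <strategy ref="contentSearch/indexConfigurations/indexUpdateStrategies/onPublishEndAsync" />
  </strategies>
  <locations hint="list:AddCrawler">
    <crawler type="Sitecore.ContentSearch.SitecoreItemCrawler, Sitecore.ContentSearch">
      <Database>web</Database>
      <!-- Example root: only items below this folder are crawled -->
      <Root>/sitecore/content/Products</Root>
    </crawler>
  </locations>
</index>
```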

When done, log into Sitecore as admin and go to the indexing manager in the control panel. Your new index should be listed so you can (re)generate it (you may have to restart your site or web server first). Once done, it should have created a folder sitecore_myindex in your Sitecore data folder that you can open with Luke to see the new index contents. A number of fields (most of them starting with an underscore) will always be present as Sitecore indexes them by default.

The above creates a working custom index for a single server setup using Lucene. For multiple server setups SOLR may be a better solution than Lucene, and more configuration may be required depending on your environment.

Be aware that when using the “onPublishEndAsync” publishing strategy there may be a small delay between an item being published and the index being updated. This can result in an updated or newly created item not showing up on your web site right away.

Storing information in the new index

We will be using our index for specific items. Let’s say we have to build a website showing product information, and have defined a template “Product” in Sitecore for creating product items. In Sitecore.ContentSearch.Lucene.MyIndexConfiguration.config, in the <include hint="list:IncludeTemplate"> node, add a node:

<productTemplateId>{guid}</productTemplateId>

where {guid} should be the guid of your product template. Rebuild the index from the Sitecore control panel. Now when you create an item based on the “Product” template it should be present in the index after publishing. Using Luke, open your custom index and look for fields you defined in the template, or check “_uniqueid” to see if the guid of the new item is present. Note that every time you add or change something in the configuration file, or change a template, you need to regenerate the index from Sitecore’s control panel!

Using the custom index

We can now use our custom index from code for retrieving data without going to the database, which is way faster and can be used to search through and retrieve large numbers of items. Assuming you have already set up a Visual Studio project for your Sitecore site, you need to add references to Sitecore.ContentSearch.dll and Sitecore.ContentSearch.Linq.dll to your project. Then create a class to reflect the product information from the index:

using System;
using System.ComponentModel;
using Sitecore.ContentSearch;
using Sitecore.ContentSearch.SearchTypes;

public class ProductSearchResult: SearchResultItem
{
  public string ProductName { get; set; }
  public string Description { get; set; }
  public string SerialNumber { get; set; }
  public DateTime Released { get; set; }
}

The property names have to correspond with the field names; otherwise you need to annotate them with attributes to map them. The properties must also have public setters. The class has to derive from SearchResultItem, which also gives it properties like ItemId and Name that correspond with the item in the Sitecore database. Note that the name “SearchResultItem” is misleading, since the class itself has nothing to do with actual Sitecore items and search results are not linked to a database in any way.

Now add a method somewhere in your code that will use this class for retrieving the data, which can look like this:

using System.Collections.Generic;
using System.Linq;
using Sitecore.ContentSearch;

/// <summary>
/// Get the products from Sitecore, using search
/// </summary>
public List<ProductSearchResult> SearchProducts()
{
  ISearchIndex myIndex = ContentSearchManager.GetIndex("sitecore_myindex");
  // Only the search context is disposed here, never the index itself
  using (var context = myIndex.CreateSearchContext())
  {
     var query = context.GetQueryable<ProductSearchResult>();
     var result = query.ToList();
     return result;
  }
}

First you need to get an ISearchIndex instance from ContentSearchManager by telling Sitecore which index to get. Then we create a search context from this instance. From this context we request an IQueryable for our type, which ToList() then materializes into a list containing our data from the index.

Instead of using the query directly we can also call GetResults() to retrieve the data from the index along with metadata and faceting. Note that GetResults() is an extension method that resides in the Sitecore.ContentSearch.Linq.dll library. The result returned is an object containing a SearchHit collection. Each SearchHit has a property Document that is an object of the type specified on the query, in our case ProductSearchResult. This is where the data of our search result can be found. So the last two lines of the above could be replaced by the following to return the same result:

var result = query.GetResults();
return result.Hits.Select(p=>p.Document).ToList();

The code from this example will return every entry in the index. Often we will store more than one type of item in an index by entering more than one template in the configuration. The above will always map the entries matching a query to the given type even if the corresponding Sitecore item is of a different type (template), so when having more than one type in the index you’ll have to filter on template. I’ll explain in part 2 of this post how you can request a selection from the index.
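
As a first taste of such a filter, the sketch below restricts the query to a single template by comparing the TemplateId property that ProductSearchResult inherits from SearchResultItem. The class name ProductSearchService and the all-zero template ID are placeholders of mine; substitute the ID of your own “Product” template.

```csharp
using System.Collections.Generic;
using System.Linq;
using Sitecore.ContentSearch;
using Sitecore.Data;

public class ProductSearchService
{
    // Placeholder: replace with the ID of your own "Product" template.
    private static readonly ID ProductTemplateId = new ID("{00000000-0000-0000-0000-000000000000}");

    public List<ProductSearchResult> SearchProductsOnly()
    {
        ISearchIndex myIndex = ContentSearchManager.GetIndex("sitecore_myindex");
        using (var context = myIndex.CreateSearchContext())
        {
            // The Where clause is translated into an index query,
            // so entries based on other templates are never mapped.
            return context.GetQueryable<ProductSearchResult>()
                .Where(p => p.TemplateId == ProductTemplateId)
                .ToList();
        }
    }
}
```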

WARNING: do NOT call Dispose() on the index object (the ISearchIndex) or wrap it in a using block! You will end up with a corrupt index when you do. After I filed a bug report, Sitecore claimed this is by design and that Dispose() is intended for internal use only (even though there is no tooltip or any documentation about this).

Mapping index fields

The SearchResultItem class has a Fields collection to access the fields by name, much like a Sitecore Item object. However, accessing the field values directly this way bypasses any conversions and mappings done by Sitecore, so you get the raw values. Since Lucene is a third-party product that uses optimizations for storage, you have to be aware of differences in formats between Sitecore and Lucene fields. Most noticeably:

  • All field names in the index are in lowercase.
  • All IDs (Guids) are in short ID format (no brackets or hyphens). Sitecore contains operations on ID type objects to convert them.
  • Datetimes are in a format derived from ISO 8601 format.
  • The ItemId property which contains the corresponding item ID is stored in the “_group” field.
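
To illustrate the short ID format, here is a small sketch in plain .NET, without any Sitecore types. The “N” format specifier of Guid produces 32 lowercase hex digits without braces or hyphens. The class and method names are mine; inside Sitecore code you would normally rely on the conversions on the ID type instead.

```csharp
using System;

public static class ShortIdDemo
{
    // Formats a Guid as 32 hex digits without braces or hyphens.
    public static string ToIndexFormat(Guid id)
    {
        return id.ToString("N");
    }

    // Parses a 32-digit value back into a Guid.
    public static Guid FromIndexFormat(string value)
    {
        return Guid.ParseExact(value, "N");
    }
}
```

For example, the Guid {110D559F-DEA5-42EA-9C1C-8A5DF7E70EF9} becomes 110d559fdea542ea9c1c8a5df7e70ef9.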

On the class derived from SearchResultItem, you can use the [IndexField] attribute to map an index field to a property explicitly. Note that you have to specify the Lucene index field name. The [TypeConverter] attribute can be used to explicitly convert an index field to a type. The Sitecore.ContentSearch.Converters namespace contains specific conversion types for Sitecore, like the IndexFieldIDValueConverter to map fields containing IDs.

As an example for our ProductSearchResult class:

[IndexField("released")]
[TypeConverter(typeof(DateTimeConverter))]
public DateTime ReleaseDate { get; set; }

Computed fields and related (media) items

You can store computed field values in an index and expose them as properties on your SearchResultItem-derived class. This is particularly handy for storing the reference paths to related items like media items, since fields like “image” only store the alt text in the index. By creating a computed field that stores the media item reference, you can get your related media item references directly from the index. The Sitecore documentation describes how to create computed fields. You add computed fields to the index by adding them to the <fields hint="raw:AddComputedIndexField"> section of your index configuration file. Since you’re storing calculated values, be aware of how and when Sitecore updates them or you will end up with stale values!
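
A computed field for the media reference could be sketched as below. It implements the IComputedIndexField interface from Sitecore.ContentSearch.ComputedFields. The class name and the field name “Image” are assumptions for illustration, and the class still has to be registered in the configuration section mentioned above.

```csharp
using Sitecore.ContentSearch;
using Sitecore.ContentSearch.ComputedFields;
using Sitecore.Data.Fields;
using Sitecore.Data.Items;
using Sitecore.Resources.Media;

public class ProductImageUrlField : IComputedIndexField
{
    // Both properties are set by Sitecore from the configuration entry.
    public string FieldName { get; set; }
    public string ReturnType { get; set; }

    public object ComputeFieldValue(IIndexable indexable)
    {
        var indexableItem = indexable as SitecoreIndexableItem;
        if (indexableItem == null) return null;

        Item item = indexableItem.Item;
        if (item == null) return null;

        // "Image" is an assumed field name on the Product template.
        Field field = item.Fields["Image"];
        if (field == null) return null;

        var image = new ImageField(field);
        if (image.MediaItem == null) return null;

        // Store the media URL so it can be read back without a database call.
        return MediaManager.GetMediaUrl(new MediaItem(image.MediaItem));
    }
}
```

Returning null leaves the field out of the index entry for items that have no image set.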

As for the actual content of these media items you still have to get them from the Sitecore media library of course, or use a third-party product with a connector to store your media items in. This goes beyond the scope of this article. You can index the actual content of some types of media items like PDF files by using IFilters, as described by John West in his blog at http://www.sitecore.net/learn/blogs/technical-blogs/john-west-sitecore-blog/posts/2013/04/sitecore-7-indexing-media-with-ifilters.aspx.

I explain about filtering and using the LINQ interface for contentsearch in the next part on https://sionict.wordpress.com/2015/10/06/sitecore-content-search-and-linq-part-2/

ASP.NET session state and authentication

A few weeks after rebuilding a security implementation of an existing ASP.NET Webforms system, I got a call from my client saying one of their customers had lost their in-session data and was confronted with defaults from the system. A quick look at the logs showed the customer in this case had left the system idle for a while, then returned after a session timeout had occurred. As expected the user was redirected to the logon screen, but somehow managed to get back into the system bypassing the logon (although of course not as a different user).

Explanation

After some research looking into the configuration I found the installation of a third-party component we used in the system had added a line to the web.config:

<sessionState timeout="20" …….

As for the authentication, the system used ASP.NET forms authentication with the default timeout, which is 30 minutes. At this point it is important to realize that ASP.NET authentication is not connected to a particular session. ASP.NET configured for forms authentication creates an authentication ticket with a timeout that is usually stored in an authentication cookie (with default name “.ASPXAUTH”). Setting the timeout on the forms authentication does NOT set the session timeout, something that is often misunderstood or overlooked in ASP.NET applications.

Apparently the user in this case had a session timeout, but after being redirected to the logon page used the browser’s back button BEFORE the authentication timeout occurred. The difference between the session state timeout (20 minutes) and the authentication timeout (30 minutes) had left a 10 minute window in which a user without a session was still authenticated. Since the user still had a valid authentication ticket, the system just created a new session, but of course the previously stored session information was lost, presenting the user with default settings.

Synchronize session and authentication

To prevent the above situation, first of all set the authentication timeout and the session timeout to the same value. By default authentication uses a sliding expiration unless configured otherwise, meaning the counter is reset on user activity (but not necessarily after each request). Session state always uses a sliding expiration.
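
In web.config, aligned timeouts could look like the fragment below. Both values are in minutes. The loginUrl is a placeholder, and 30 is only an example value; what matters is that the two timeouts match.

```xml
<system.web>
  <authentication mode="Forms">
    <!-- loginUrl is a placeholder for your own logon page -->
    <forms loginUrl="~/Login.aspx" timeout="30" slidingExpiration="true" />
  </authentication>
  <!-- Keep the session timeout equal to the forms timeout above -->
  <sessionState timeout="30" />
</system.web>
```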

Depending on your requirements you can choose a strategy to avoid getting sessions out of sync with authentication. One way is to just reinitialize the session if it was expired and the user is still logged in. This is easy if no or little information is kept in relation to the session. Another way is to make sure session ending does end the authentication and vice versa.

Part 1: end authentication when session is expired

To implement ending the authentication after session expiration, first make sure the session sticks by entering something into it, otherwise the session will get renewed on every request. To do this, directly after authenticating the user store the session ID in a session variable. So in a logon form (ASP.NET webforms) or ASP.NET MVC controller it will look something like:

...
//Authentication, validation etc.
....

FormsAuthentication.SetAuthCookie(UserName, false);
Session["__MyAppSession"] = Session.SessionID;
..

Since we are going to bind our authentication to the session, it is pointless to set the createPersistentCookie parameter to true of course.

Now we can check on any request if we still have the session active, and if not log out the user. The exact place to do this can be tricky and causes a lot of questions on forums and such, but arguably the best location is in the Application_AcquireRequestState event in global.asax.

Since there’s no guarantee we have a valid user or session in this event, we need to do a lot of null-checking. The code will look like this:

void Application_AcquireRequestState(object sender, EventArgs e)
{
    var session = System.Web.HttpContext.Current.Session;
    if (session == null || string.IsNullOrWhiteSpace(session.SessionID)) return;
    var userIsAuthenticated = User != null &&
                              User.Identity != null &&
                              User.Identity.IsAuthenticated;
    if (userIsAuthenticated && !session.SessionID.Equals(Session["__MyAppSession"]))
    {
        Logoff();
    }
    // part 2 gets here
}

private void Logoff()
{
    FormsAuthentication.SignOut();
    var authCookie = new HttpCookie(FormsAuthentication.FormsCookieName, string.Empty) { Expires = DateTime.Now.AddYears(-1) };
    Response.Cookies.Add(authCookie);
    FormsAuthentication.RedirectToLoginPage();
}

Now, if a request is sent after the session has expired while the user is still authenticated, the stored ID (or actually the session variable “__MyAppSession”) will no longer be present and the user will be logged off. The part with the cookie I will explain below.

Part 2: end session when authentication ends (timeout)

This can be added quite easily. After the comment in the above event code, add the following:

if (!userIsAuthenticated && session.SessionID.Equals(Session["__MyAppSession"]))
{ 
    ClearSession();
}

And the ClearSession method:

private void ClearSession()
{
    Session.Abandon();
    var sessionCookie = new HttpCookie("ASP.NET_SessionId", string.Empty) { Expires = DateTime.Now.AddYears(-1) };
    Response.Cookies.Add(sessionCookie);
}

Now if a session exists with session information while the user is no longer authenticated, the session will be abandoned.

Part 3: full user logout

When a user actively logs off we have to clear both the authentication and the session. Since this is usually a single “fire and forget” operation that can be called from various places it’s usually implemented best as a static operation in a logical place.

Per recommendation it is best not to only call FormsAuthentication.SignOut() and Session.Abandon(), but also to actively overwrite the cookies with ones that have an expiry date in the past. So a full logoff will look like this:

public static void Logoff()
{
    // A static method has no Session or Response of its own, so take them from the current context
    var context = HttpContext.Current;
    FormsAuthentication.SignOut();
    context.Session.Abandon();
    var authCookie = new HttpCookie(FormsAuthentication.FormsCookieName, string.Empty) { Expires = DateTime.Now.AddYears(-1) };
    context.Response.Cookies.Add(authCookie);
    var sessionCookie = new HttpCookie("ASP.NET_SessionId", string.Empty) { Expires = DateTime.Now.AddYears(-1) };
    context.Response.Cookies.Add(sessionCookie);
    FormsAuthentication.RedirectToLoginPage();
}

In the above Session and Response are of course from the context (controller or form).

A note on ASP.NET MVC: per recommendation MVC applications should be stateless as much as possible, and not store information in the session. In this case it doesn’t really matter if the session gets renewed automatically while a user is still authenticated since there shouldn’t be any persistent information in there anyway.

Sitecore 7.0 with Windows Identity Foundation 4.5 security

Recently I found myself on a project with the task of implementing sign-in for a new intranet platform based on Sitecore 7.0, using MVC and .NET 4.5 and running on the Windows Azure cloud platform. Per requirements the end customer didn’t want to maintain user information within Sitecore, but wanted to use multiple ADFS 2.0 and other domains for authentication. The Azure Access Control Services (ACS) would be the central gateway for user authentication.

The requirements called for a federated security model combined with Sitecore virtual users. Since the platform is all .NET 4.5, the logical choice for implementing the federated security was the Windows Identity Foundation (WIF). For using WIF within a .NET application Microsoft already provides a lot of examples, many of them not even requiring code but using configuration only. However, almost all of these examples apply to a more or less standard .NET application and won’t work within a Sitecore environment. The main reason for this is the difference between the .NET 4.5 claims-based security implementation and the Sitecore security model. Also, with .NET 4.5 WIF has been fully integrated into the .NET framework core and therefore differs in some respects from earlier versions.

WIF, ACS and Sitecore

In this post I’ll explain a way to implement WIF security in a Sitecore 7.0/MVC environment. For this article I assume the reader is familiar with the terms and abbreviations used in federated security and WIF. If not, there’s plenty of information to be found on the net. I also won’t go into details of setting up ACS itself or (the trusts with) ADFS 2.0 domains. A detailed explanation of how this works can be found at http://azure.microsoft.com/en-us/documentation/articles/active-directory-dotnet-how-to-use-access-control/ and related articles.

For the rest of this article you should have created and configured an ACS namespace for you or your organization with at least one identity provider. For legal reasons any code, configuration and references here are examples and not from the actual project. In production environments more exception- and security handling is required.

Steps involved

In our implementation, when a user that’s not yet authenticated goes to the site, the following main steps take place:

  1. Our system makes a request to the ACS to retrieve the list with information for the configured identity providers. The user is redirected to our “login” page, which is similar to normal forms authentication except the user is presented with the list of identity providers instead of a username/ password page, and has to pick the provider of his or her choice;
  2. After picking a provider, the system calls the login url that came with the information from the ACS. If the user is already authenticated with this provider, it immediately returns a security token with a claimset. If not, the user is presented with a login page or box by this provider and has to log in;
  3. The token and claimset are returned through the ACS, which may or may not transform or add any information, depending on how it is configured. The ACS returns a security token and the claimset to our system;
  4. WIF intercepts the returned information and performs the necessary checks and steps. See the MSDN pages on “WSFederationAuthenticationModule” for more information;
  5. Our system retrieves the information (claims) through the WIF modules;
  6. With this information we create a Sitecore virtual user, add the necessary roles and attributes to it and log in.

Step 1: Get a list of registered (trusted) IPs from ACS

The first step is to get the information on the configured providers from our ACS. This can be done with a call to a JavaScript endpoint that exists on the ACS. This call can have a number of query parameters, of which three are required:

  • protocol, in this case wsfederation
  • version, in this case 1.0
  • realm, the url of your (future) web application that has been configured in your ACS portal as a RP (Relying Party) application.

Let’s say we have configured http://localhost/ on our ACS as relying party. The call will then look like the following:


https://namespace.accesscontrol.windows.net/v2/metadata/IdentityProviders.js?protocol=wsfederation&version=1.0&realm=http://localhost

where namespace is the namespace that you registered with ACS for you or your organization. This call will return a JSON structure containing an array of objects (one for each configured identity provider) that translates to the following C# class:

[Serializable]
public class IdentityProvider
{
  public List<string> EmailAddressSuffixes { get; set; }
  public string ImageUrl { get; set; }
  public string LoginUrl { get; set; }
  public string LogoutUrl { get; set; }
  public string Name { get; set; }
}

Note that the returned JSON structure uses the Microsoft C# naming convention and not the common JavaScript convention. When using JSON.NET to deserialize the response, the code for sending the request and getting the result will look like the following:

 List<IdentityProvider> Providers;
 using (System.Net.WebClient webClient = new System.Net.WebClient())
 {
   webClient.Encoding = System.Text.Encoding.UTF8;
   string jsonResponse = webClient.DownloadString(requestString);
   Providers = JsonConvert.DeserializeObject<List<IdentityProvider>>(jsonResponse);
 }

where requestString must contain the request as shown before. We used a simple form with a submit button and a dropdown box. The dropdown listed the Name property of each provider and used the serialized provider object as its value, so we didn’t have to store anything in session variables or hidden fields, keeping our application stateless as recommended for MVC applications.
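
The requestString itself can be assembled from the ACS namespace and the realm. The helper below is a minimal sketch with names of my own choosing. It URL-encodes the realm, which the example URL above leaves unencoded for readability.

```csharp
using System;

public static class AcsRequestBuilder
{
    // Builds the URL of the ACS identity provider endpoint described above.
    public static string BuildIdentityProvidersUrl(string acsNamespace, string realm)
    {
        return string.Format(
            "https://{0}.accesscontrol.windows.net/v2/metadata/IdentityProviders.js" +
            "?protocol=wsfederation&version=1.0&realm={1}",
            acsNamespace,
            Uri.EscapeDataString(realm)); // the realm is itself a URL, so it has to be escaped
    }
}
```

Calling AcsRequestBuilder.BuildIdentityProvidersUrl("namespace", "http://localhost/") yields the request from the example with the realm encoded as http%3A%2F%2Flocalhost%2F.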

Step 2: Request authentication from chosen IP

Once the user has selected an identity provider (IP), we deserialize the value back to our IdentityProvider object and use the value of the LoginUrl property to request authentication from that IP. In an MVC environment we can do this very easily by returning a Redirect action result to that URL. The LoginUrl property should contain the full URL (including going through the ACS) with all information required. Let’s say the user selected an IP and the submit action calls this controller method (the SelectedIdentityProvider parameter should contain the value property of the chosen provider from the dropdown):

[HttpPost]
[System.Web.Mvc.AllowAnonymous] // the MVC attribute; the System.Web.Http one only applies to Web API controllers
public ActionResult ProviderSelected(string SelectedIdentityProvider)
{
    IdentityProvider provider;
    // …
    // Code to retrieve the SelectedIdentityProvider object and assign it to provider
    // …
    return Redirect(provider.LoginUrl);
}

If not yet authenticated the user should be presented with a login box or screen by that IP. Once authenticated, the IP returns the issued security token to our ACS namespace which was set as the wreply parameter in the Login URL.

Step 3: Returning the security token and claims

On the ACS portal we should have configured our Sitecore application as relying party and set the “Return URL” field to the URL of a controller method that handles further login. Optionally we can set an “Error URL” and implement an error handling controller method in case something went wrong on the IP side.

The ACS calls back to our application on the return URL. This return call should be intercepted and processed by the WIF modules (see next step) and then WIF actually calls the return URL on our application. This processing involves validating the returned token and then creating a ClaimsPrincipal, using this to create a session security token. Because the WIF modules reside in the ASP.NET pipeline the security can be implemented in a standard .NET application using configuration only. However this ClaimsPrincipal is an IPrincipal implementation and this is where the problem arises within a Sitecore 7.0 environment, since Sitecore security and users do not derive (yet?) from this claims model.

Step 4: Setting up WIF to process the returned security token and claims

The core modules here are the WSFederationAuthenticationModule and the SessionAuthenticationModule, which both exist as properties on the FederatedAuthentication static class. In .NET 4.5 the classes reside in the System.IdentityModel.Services namespace. You need to add references to System.IdentityModel and System.identityModel.Services in your project. Note that the second reference may (accidentally?) contain a lowercase “i”, violating the usual Microsoft naming convention.

We derived our own ScFederationAuthenticationModule and ScSessionAuthenticationModule from these classes, since in both(!) classes we need to override the InitializeModule, InitializePropertiesFromConfiguration and OnAuthenticateRequest methods. We define two boolean fields, moduleInitialized and propertiesInitialized, on each of our derived classes. See also http://msdn.microsoft.com/en-us/library/system.identitymodel.services.httpmodulebase.init(v=vs.110).aspx for this.

protected override void InitializeModule(System.Web.HttpApplication context)
{
  if (this.moduleInitialized) return;
  this.moduleInitialized = true;
  base.InitializeModule(context);
}

protected override void InitializePropertiesFromConfiguration()
{
  if (this.propertiesInitialized) return;
  this.propertiesInitialized = true;
  base.InitializePropertiesFromConfiguration();
}

protected override void OnAuthenticateRequest(object sender, EventArgs args)
{
  // Skip event if Sitecore user already authenticated.
  if (Sitecore.Context.User != null && Sitecore.Context.User.IsAuthenticated)
  {
    return;
  }
  base.OnAuthenticateRequest(sender, args);
}

The overrides are necessary to prevent WIF from interfering after we have created and signed in our (virtual) user in Sitecore.

The WIF modules need to be in the ASP.NET pipeline so they need to be added to web.config. Under the <modules> node in <system.webServer> add the following 2 entries:

<add name="WSFederationAuthenticationModule" type="SitecoreFedSecurity.ScFederationAuthenticationModule, SitecoreFedSecurity" />
<add name="SessionAuthenticationModule" type="SitecoreFedSecurity.ScSessionAuthenticationModule, SitecoreFedSecurity" />

with SitecoreFedSecurity being the namespace and assembly name for our derived classes. These entries need to be right after the Sitecore.Nexus.Web.HttpModule entry.

We then need to define 2 configuration sections for these modules:

<section name="system.identityModel" type="System.IdentityModel.Configuration.SystemIdentityModelSection, System.IdentityModel, Version=4.0.0.0, Culture=neutral, PublicKeyToken=B77A5C561934E089" />
<section name="system.identityModel.services" type="System.IdentityModel.Services.Configuration.SystemIdentityModelServicesSection, System.IdentityModel.Services, Version=4.0.0.0, Culture=neutral, PublicKeyToken=B77A5C561934E089" />

And the definitions of these sections:

<system.identityModel>
  <identityConfiguration>
    <audienceUris>
      <add value="http://localhost/" />
    </audienceUris>
    <securityTokenHandlers>
      <add type="System.IdentityModel.Services.Tokens.MachineKeySessionSecurityTokenHandler, System.IdentityModel.Services, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089" />
      <remove type="System.IdentityModel.Tokens.SessionSecurityTokenHandler, System.IdentityModel, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089" />
    </securityTokenHandlers>
    <certificateValidation certificateValidationMode="None" />
    <issuerNameRegistry type="System.IdentityModel.Tokens.ConfigurationBasedIssuerNameRegistry, System.IdentityModel, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089">
      <trustedIssuers>
        <add thumbprint="[certificate thumbprint]" name="https://sionict.accesscontrol.windows.net/" />
      </trustedIssuers>
    </issuerNameRegistry>
  </identityConfiguration>
</system.identityModel>
<system.identityModel.services>
  <federationConfiguration>
    <cookieHandler requireSsl="false" />
    <wsFederation passiveRedirectEnabled="false" issuer="https://[namespace].accesscontrol.windows.net/v2/wsfederation" realm="http://localhost/" requireHttps="false" persistentCookiesOnPassiveRedirects="false" />
  </federationConfiguration>
</system.identityModel.services>

This configuration is explained in various articles about WIF so I won’t go into detail here. A note though about the issuerNameRegistry node: there seem to be two variants, of which this is the one that currently seems to work. Previously you needed to add System.IdentityModel.Tokens.ValidatingIssuerNameRegistry from NuGet (which has a different setup) to your project, as can be found in various examples on the net, but that doesn’t seem to work anymore. See http://stackoverflow.com/questions/23692326/adfs-2-0-microsoft-identityserver-web-invalidrequestexception-msis7042.

Replace the references to localhost with your own application’s URL if it is different. There are two other things here that are specific to your situation: [namespace] should be replaced with your specific ACS namespace, and [certificate thumbprint] is the thumbprint of the X.509 certificate for your ACS namespace. It can be found in the ACS portal under “Certificates and Keys”.

Aside from the above configuration your application should expose a FederationMetadata.xml which simplifies maintenance on the ACS. See the Microsoft documentation on this. The location of this file and a few other paths need to be accessible for any (anonymous) user. See the Notes at the end of this post.

Step 5: Retrieving claims information

Now that we have WIF set up in our system, we can implement the Controller method we have set as return URL on the ACS to retrieve the information from the claimset. Since we need to create and authenticate a Sitecore user, we need to retrieve the necessary information from WIF and perform a few checks. In our case we named the controller method for the return URL SignIn:

public string SignIn()
{
  bool result = false;
  System.IdentityModel.Tokens.SessionSecurityToken sessionToken = null;
  System.Security.Claims.ClaimsPrincipal claimsPrincipal = null;
  try
  {
    result = ((SCSessionAuthenticationModule)FederatedAuthentication.SessionAuthenticationModule).TryReadSessionTokenFromCookie(out sessionToken);
    if (result) claimsPrincipal = sessionToken.ClaimsPrincipal;
  }
  catch (System.Exception ex)
  {
    return string.Format("Could not retrieve session security token cookie. Could not create user. Exception: {0}", ex.Message);
  }
  //Check status
  if ((claimsPrincipal == null) || (claimsPrincipal.Identity == null))
  {
    return "No claimsPrincipal is set. Could not create user";
  }
  if (!claimsPrincipal.Identity.IsAuthenticated)
  {
    return string.Format("Chosen identity provider did not authenticate identity {0}", claimsPrincipal.Identity.Name);
  }
  //TODO: Create a virtual user based on the principal
}

We access the security token cookie set by WIF through the TryReadSessionTokenFromCookie method of the SessionAuthenticationModule. Despite the “Try..” naming of the method it still throws an exception if the cookie could not be read, so you need to add exception handling here. After getting the token you need to verify that the principal is present and that the user is actually authenticated by the IP.

Step 6: Creating the (virtual) Sitecore user and log in

Now that we have the information from the identity provider we can create and log in our Sitecore virtual user. Replace the “TODO” comment in the above code with the following:

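//Prefer the name set by WIF; fall back to the value of the first claim if the provider sent no name
string identifier = string.IsNullOrWhiteSpace(claimsPrincipal.Identity.Name) ?
  claimsPrincipal.Claims.Select(c => c.Value).FirstOrDefault() :
  claimsPrincipal.Identity.Name;
//An empty claimset would otherwise cause a NullReferenceException or a nameless user
if (string.IsNullOrWhiteSpace(identifier))
{
  return "No usable claim found in the claimset. Could not create user";
}
Sitecore.Security.Accounts.User user = Sitecore.Security.Authentication.AuthenticationManager.BuildVirtualUser("extranet\\" + identifier, true);
//Add any roles or attributes for the user here, before login
Sitecore.Security.Authentication.AuthenticationManager.LoginVirtualUser(user);
return string.Empty;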

The Name property should be set by WIF from the corresponding claim, but not all identity providers include a name in the claimset, so it can be null. Windows Live, for example, only returns a unique ID. In our example here we pick the first claim from the set, but which claim you need depends on your situation. We also haven’t set any roles or additional properties here, but that should be pretty straightforward using the Sitecore API.
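
As an illustration of setting roles and properties, the following is a minimal sketch of a helper that could be called before LoginVirtualUser. The method name ApplyClaims and the mapping of role claims to roles in the extranet domain are assumptions made for this example, not part of the original setup. Which claim types your identity provider actually sends has to be checked per provider, and the Sitecore calls should be verified against your Sitecore version.

//Sketch only: copies e-mail and role claims onto the virtual user before login.
//ApplyClaims and the "extranet\\" role prefix are hypothetical choices for this example.
private static void ApplyClaims(Sitecore.Security.Accounts.User user, System.Security.Claims.ClaimsPrincipal principal)
{
  //Copy the e-mail address if the identity provider sent one
  System.Security.Claims.Claim email = principal.FindFirst(System.Security.Claims.ClaimTypes.Email);
  if (email != null) user.Profile.Email = email.Value;

  //Add one Sitecore role per role claim; roles unknown to Sitecore are skipped
  foreach (System.Security.Claims.Claim claim in principal.FindAll(System.Security.Claims.ClaimTypes.Role))
  {
    string roleName = "extranet\\" + claim.Value;
    if (Sitecore.Security.Accounts.Role.Exists(roleName))
    {
      user.Roles.Add(Sitecore.Security.Accounts.Role.FromName(roleName));
    }
  }
  user.Profile.Save();
}

It would be called as ApplyClaims(user, claimsPrincipal); at the position of the “Add any roles or attributes” comment. Skipping unknown roles rather than creating them keeps the identity provider from introducing new roles into Sitecore by itself.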

Signing out

Besides a login URL, the identity provider information also contains a signout URL we can use to sign out the user with the chosen IP. Completely signing out can involve 4 steps:

  • Sign out of Sitecore with AuthenticationManager.Logout();
  • Sign out of WIF with the SignOut() method on the WSFederationAuthenticationModule (or rather our derived class);
  • Sign out with the identity provider with WSFederationAuthenticationModule.FederatedSignOut(..), using the signout URL;
  • Sign out with ACS using WSFederationAuthenticationModule.GetFederationPassiveSignOutUrl(..)
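
The steps above could be combined roughly as in the sketch below. The method and the parameter names providerSignOutUrl and returnUrl are hypothetical. Passing the ACS signout URL as the reply address of the identity provider signout is an assumption of this example, since FederatedSignOut redirects the browser and nothing after it runs in the same request. Whether the provider actually follows the reply address differs per provider.

using System.IdentityModel.Services;

//Sketch only: providerSignOutUrl is the signout URL from the identity provider information,
//returnUrl is a (hypothetical) page in your own site to end up on afterwards.
public void SignOut(string providerSignOutUrl, string returnUrl)
{
  //Step 1: sign out of Sitecore
  Sitecore.Security.Authentication.AuthenticationManager.Logout();

  //Step 2: sign out of WIF
  WSFederationAuthenticationModule module = FederatedAuthentication.WSFederationAuthenticationModule;
  module.SignOut();

  //Step 4 (prepared first): build the ACS signout URL, returning to our own site afterwards
  string acsSignOutUrl = WSFederationAuthenticationModule.GetFederationPassiveSignOutUrl(module.Issuer, returnUrl, null);

  //Step 3: redirect to the identity provider, asking it to continue to ACS when done
  WSFederationAuthenticationModule.FederatedSignOut(providerSignOutUrl, acsSignOutUrl);
}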

Federated security and signing out can be problematic. It is up to the identity provider if and how to process a signout request, and some providers ignore these requests completely, so it may not be possible to sign out at all. Other providers abort the above sequence because they do not return after the signout request but display a message page instead. Microsoft has also drawn quite some criticism for providing plenty of examples of federated sign-in but few about signing out. Be aware that, especially in public environments, even after the above steps (and even after closing the browser, as some providers instruct you to do!) the user may not be signed out by the IP, so a subsequent session is authenticated automatically without a login.

Notes

A few things must be kept in mind when implementing this security model:

  • The built-in Sitecore users still need to be able to log in through the CMS login page. Also the FederationMetadata folder should be accessible, and possibly some other paths containing styles, images or scripts. We use the <location> configuration setting to give all users access to these folders:
    <location path="FederationMetadata">
      <system.web>
        <authorization>
          <allow users="*" />
        </authorization>
      </system.web>
    </location>
    

    Unfortunately the <location> setting can take only one path so you need to create an entry like this for every path.

  • Make sure you have set up MVC routing properly for your Sitecore environment for the callbacks from the ACS to work.
  • Both WIF modules contain an OnAuthenticateRequest. As it turns out this name is somewhat confusing as it is called on every request, and the actual check whether or not it is a request for authentication is done within the (base) implementation of this method.
  • When hooking up WIF events, be aware that the WIF modules are alive as long as the session is active because they are set as properties on the (static) FederatedAuthentication class, but MVC objects like controllers are disposed between calls. So when WIF is processing a request and firing the various events before calling the return URL, there is no MVC controller alive.
  • Since this all involves security and users, I got remarks that there should be an ASP.NET MembershipProvider somewhere. It is possible to implement certain parts in methods of a custom MembershipProvider if it needs to be enriched with information from Sitecore or another local storage. Do realize that membership providers are nothing more than an abstraction between user information storage and applications, and within a federated security model this is all delegated to the identity provider.