Friday 13 April 2007

Improving build performance: assemblies in the GAC

While working with a MonoRail-based web project, I added some of the Castle Project assemblies to the GAC to improve build performance.

Obviously, there is a time when you need a fast build, and there is a time when you need to be sure that all the assemblies your application uses are in the /bin folder, so I wrote two batch files to cache and uncache the assemblies listed in two text files (example code at the end of the article).

I tested on my machine, with a sizable web application project, the effect that having or not having those assemblies in the GAC has on the build time; these are the rounded average results:

Build      GAC    No GAC
Solution   10"    27"
Web         5"    22"

That is, on my machine, once those assemblies are in the GAC I gain 17" on every build.

[Note: the Web build is normally 5" faster than the Solution build because the solution contains projects not directly needed by the Web (first and foremost the tests, then some data migration and seeding code, a productivity measurement tool, and so on).]

Why does this happen? My best answer is that having the Castle Project assemblies in the GAC means they are no longer copied into the various /bin folders during the build.
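
As far as I can tell (check your own project file to confirm), once an assembly is registered in the GAC Visual Studio defaults the Copy Local property of the reference to false, which in the .csproj can be made explicit through the Private metadata, and MSBuild then skips the copy into /bin. A sketch of such a reference (the HintPath is just an example):

<!-- reference in the web application .csproj -->
<Reference Include="Castle.ActiveRecord">
  <HintPath>..\..\Resources\bin\Castle.ActiveRecord.dll</HintPath>
  <!-- Copy Local = false: resolve the assembly from the GAC, do not copy it into /bin -->
  <Private>False</Private>
</Reference>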

I do advise you to test this on your machines as well; in any case, please read carefully and make sure you understand the following points before moving ahead:

- The build phase generates assemblies from the source code. Those assemblies contain bytecode, not native code, and unless they are suitably precompiled (with NGen, for example) they will be JIT-compiled when the .NET application starts up. This means that after a build, refreshing the web page will take exactly the same time to answer as it did before the Castle Project assemblies went into the GAC.

- I had to modify the web.config, because the Castle.ActiveRecord assembly has to be explicitly added there (see the sketch after this list).

- You may notice I also added Castle.Monorail.Framework and Castle.Monorail.Views.Brail to the GAC. Unfortunately, it seems IIS requires those two files before it has read the whole web.config, so I had to set the Copy Local property of those two references back to true (so in the /bin folder of the web application project you can still find these two assemblies and their related files). This means you need to slightly modify the web application project as well to pick up these changes.

- I didn't test the web with a Web Site Project configuration; if you happen to use one, you will probably get an error due to missing assemblies. I do not advise that project option, for an unending list of reasons, but if you really wish to go your own way, you should be able to solve the problem just by making sure the appropriate assemblies are copied into the /bin folder of the web application. Manually copying files (I mean with some pre- or post-build event) is a bad practice, because it worsens build performance, so exercise care.

- You may be tempted to modify the list of the assemblies to put in the GAC. Please be careful when doing so, because only strongly named assemblies can be added this way. For example, Newtonsoft.Json is not in that list because it is not a strongly named assembly (a quick check is sketched after this list).

- Microsoft doesn't recommend putting the assemblies you are developing in the GAC. That is, MyDomainModel.dll or MyInitialisation.dll still have to be kept outside the GAC. If in the future you stabilise one of those libraries, you may decide to sign it (to make it strongly named) and then add it to the GAC.
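
For reference, this is more or less the web.config change I mean for Castle.ActiveRecord; the version and public key token below are placeholders, and gacutil /l Castle.ActiveRecord prints the real ones once the assembly is cached:

<!-- inside <configuration> / <system.web> -->
<compilation>
  <assemblies>
    <!-- placeholder version and token: replace them with the values gacutil /l reports -->
    <add assembly="Castle.ActiveRecord, Version=0.0.0.0, Culture=neutral, PublicKeyToken=xxxxxxxxxxxxxxxx"/>
  </assemblies>
</compilation>

And if you want to check whether an assembly is strongly named before adding it to the list, a little helper like this (my own sketch, not part of the batch files at the end of the article) is enough:

using System;
using System.Reflection;

// Prints, for each assembly path passed on the command line, whether it carries
// a public key token (that is, whether it is strongly named and can go in the GAC).
public class StrongNameCheck
{
    public static void Main(string[] args)
    {
        foreach (string path in args)
        {
            AssemblyName name = AssemblyName.GetAssemblyName(path);
            byte[] token = name.GetPublicKeyToken();
            bool isStrongNamed = (token != null) && (token.Length > 0);
            Console.WriteLine("{0}: {1}", name.Name, isStrongNamed ? "strongly named" : "NOT strongly named");
        }
    }
}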

I tested all of this on this MonoRail-based web project, but the results will probably be more or less similar with plain ASP.NET projects.

Batch file to cache assemblies

pause
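REM note: adjust the gacutil path below (and the one in the uncache batch) to your own SDK install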
"C:\Program Files\Microsoft.NET\SDK\v2.0 64bit\Bin\gacutil" /f /il .\toCache.txt
pause


Batch file to uncache assemblies

pause
"C:\Program Files\Microsoft.NET\SDK\v2.0 64bit\Bin\gacutil" /f /ul .\toUncache.txt
pause


ToCache.txt (the list of files to cache; note that the paths are relative to the working directory of the batch file)

..\..\Resources\bin\anrControls.Markdown.NET.dll
..\..\Resources\bin\Boo.Lang.Compiler.dll
..\..\Resources\bin\Boo.Lang.dll
..\..\Resources\bin\Boo.Lang.Parser.dll
..\..\Resources\bin\Castle.ActiveRecord.dll
..\..\Resources\bin\Castle.Components.Binder.dll
..\..\Resources\bin\Castle.Components.Common.EmailSender.dll
..\..\Resources\bin\Castle.Components.Common.EmailSender.SmtpEmailSender.dll
..\..\Resources\bin\Castle.Core.dll
..\..\Resources\bin\Castle.DynamicProxy.dll
..\..\Resources\bin\Castle.Monorail.ActiveRecordSupport.dll
..\..\Resources\bin\Castle.Monorail.Framework.dll
..\..\Resources\bin\Castle.Monorail.Views.Brail.dll
..\..\Resources\bin\Castle.Services.Logging.Log4netIntegration.dll
..\..\Resources\bin\Iesi.Collections.dll
..\..\Resources\bin\log4net.dll
..\..\Resources\bin\NHibernate.dll
..\..\Resources\bin\NHibernate.Caches.SysCache.dll
..\..\Resources\bin\NHibernate.Generics.dll
..\..\Resources\bin\Nullables.dll
..\..\Resources\bin\Nullables.NHibernate.dll


ToUncache.txt (the list of assembly names to uncache; note that gacutil /ul wants assembly names, not file paths)

anrControls.Markdown.NET
Boo.Lang.Compiler
Boo.Lang
Boo.Lang.Parser
Castle.ActiveRecord
Castle.Components.Binder
Castle.Components.Common.EmailSender
Castle.Components.Common.EmailSender.SmtpEmailSender
Castle.Core
Castle.DynamicProxy
Castle.Monorail.ActiveRecordSupport
Castle.Monorail.Framework
Castle.Monorail.Views.Brail
Castle.Services.Logging.Log4netIntegration
Iesi.Collections
log4net
NHibernate
NHibernate.Caches.SysCache
NHibernate.Generics
Nullables
Nullables.NHibernate

Thursday 12 April 2007

Web Site Projects vs Web Application Projects

For web development, Visual Studio 2005 (VS 2005) supports two project options: Web Site Projects (WSPs) and Web Application Projects (WAPs). The former was built into the initial release of VS 2005; the latter was introduced later and is included in VS 2005 SP1.
WSPs leverage the very same dynamic compilation system used by ASP.NET 2.0 at runtime; with this option you don't need a project file (you just follow some easy and intuitive folder conventions).
WAPs leverage the MSBuild build system, and all the code in a project is compiled into a single assembly; with this option you do need a project file.
There are various pros and cons to each model, but if we focus just on build speed, the WAP option wins hands down every time you build from scratch or rebuild.
Why?
The main reason, as explained by Scott Guthrie, is that with a WAP only the code-behind and the /App_Code classes are built, while with a WSP the ASP.NET compilation system also spends time analysing and compiling all the ASP.NET content (pages/controls/inline code).
MonoRail case:
First and foremost, having an MSBuild-capable project, as with the WAP option, helps when you have to integrate your projects with tools like CruiseControl.NET. It is possible to do that with the WSP option too, but it is not as neat.
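To give an idea, the command a CruiseControl.NET task ends up running against a WAP is just plain MSBuild; a minimal sketch, where MySolution.sln is a made-up name and the path is the standard .NET 2.0 one:

%WINDIR%\Microsoft.NET\Framework\v2.0.50727\msbuild.exe MySolution.sln /t:Rebuild /p:Configuration=Release
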
Second, when using MonoRail the number of ASP.NET pages in the web project is probably negligible (usually just one, quite trivial in any case, left there to "fire" the build under certain circumstances), so there is no real reason to use the deep verification features of the WSP model: there are no ASP.NET-related files to build.
From experience and investigation, I can add that the web offers various tips and tricks to speed up a WSP build, but the point is that however much you speed it up, the target of that kind of exercise is just to make it as fast as a WAP build.
Conclusion:
On non-trivial web projects, when you don't need deep verification of the ASP.NET-related files, always use the WAP option.

P.S.: I shouldn't add this, but really, don't use the /App_Code folder under almost any circumstance. Put your logic in separate layers (VS class library projects), and if you really need to use the /App_Code folder, leave only very stable classes there.

Wednesday 4 April 2007

Transaction and Concurrency on SQL Server 2005

Ayende recently posted an article about transactions and concurrency; yes, I read his blog very often, and if you are a .NET developer you should too.
He noticed that with the ReadCommitted isolation level something was not working as he expected. I have to confess that when I first read his post and some of the comments, I superficially concluded that it could be related to the emergence of the phantom phenomenon.

The SQL standard (at least SQL-92 and SQL-99) defines the phantom phenomenon as follows:

P3 (‘‘Phantom’’): SQL-transaction T1 reads the set of rows N that satisfy some <search>.
SQL-transaction T2 then executes SQL-statements that generate one or more rows that satisfy the <search> used by SQL-transaction T1. If SQL-transaction T1 then repeats the initial read with the same <search>, it obtains a different collection of rows.


Today I tried Ayende's code, and I finally understood that the issue is far more interesting than I had supposed (lesson 1: run the code you are trying to understand).

Under Ayende's hypotheses (which are far from exotic, but for one little point), the ReadCommitted isolation level doesn't work as expected on Microsoft SQL Server 2005! Kudos to Stuart Carnie, who grokked this well before me and pointed out a very interesting article by Tony Rogerson.

Rogerson's conclusion is that on Microsoft SQL Server 2005 the ReadCommitted isolation level (and its ReadCommittedSnapshot sibling) "does not give a point in time view of your data", so when the need arises the safe way to go may be the Snapshot isolation level (which, incidentally, seems to have been a plaything of Microsoft Research for some ten years).
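
If you wish to try the Snapshot isolation level with the little tool described below, remember that it has to be enabled on the database first; this sketch assumes a database named test, like the one in my connection string:

-- allow the SNAPSHOT isolation level on the test database
ALTER DATABASE test SET ALLOW_SNAPSHOT_ISOLATION ON;
-- optionally, make the READ COMMITTED level use row versioning as well
ALTER DATABASE test SET READ_COMMITTED_SNAPSHOT ON;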

Today I played a bit with Ayende's code, and I wrote a little console application I called TransactionsAndConcurrency.exe:
TransactionsAndConcurrency mode ilp ilc [iterations [records]]
The mode argument can be one of "aye", "aye-run" or "ale"; ilp and ilc (the isolation levels of the producer and of the consumer) can each be one of "ch", "rc", "rr", "ru", "se", "sn" (respectively Chaos, ReadCommitted, RepeatableRead, ReadUncommitted, Serializable and Snapshot), while iterations and records are self-explanatory integers.
For example, you may call this little console application like this:
TransactionsAndConcurrency aye rc rc
or like this:
TransactionsAndConcurrency ale rc rc 20 500
You will notice that when called with aye the application behaves essentially like Ayende's code; with aye-run the behaviour is slightly different (it doesn't stop the first time the consumer fetches an "unexpected" number of rows); with ale, instead, it unexpectedly works (I tested it with rc and ru up to 5000 records several times, without a single glitch).
What is the difference? The exotic point in Ayende's hypotheses, namely that his table doesn't have a primary key. If the same exercise is done on a table with a primary key (using a surrogate key through a T-SQL IDENTITY column), everything goes fine.

I have to confess I am still wondering whether this is a bug in Microsoft SQL Server 2005, because those are obviously uncommitted phantom rows showing up, and something in the locking of tables without a primary key is clearly not working as most of us would expect.

If you wish to play as well, here is the code:
using System;
using System.Data.SqlClient;
using System.Threading;
using System.Data;

namespace TransactionsAndConcurrency
{
public class Program
{
#region Vars and Consts
private const int DEFAULT_ITERATIONS = 20;
private const int DEFAULT_RECORDS = 500;

private static int records = DEFAULT_RECORDS;
private static int iterations = DEFAULT_ITERATIONS;
private static IsolationLevel isolationLevelProducer = IsolationLevel.Unspecified;
private static IsolationLevel isolationLevelConsumer = IsolationLevel.Unspecified;
private static string mode;

private static string connectionString = "Data Source=myDB;Initial Catalog=test;User=sa;Pwd=ICannotSay1!;";
#endregion

#region Setup
private static void Setup()
{
SqlConnection connection = new SqlConnection(connectionString);
connection.Open();
SqlTransaction sqlTransaction = connection.BeginTransaction();
for (int i = 0; i < records; i++)
{
SqlCommand sqlCommand = connection.CreateCommand();
sqlCommand.Transaction = sqlTransaction;
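// "aye" and "aye-run" use Ayende's original table t, which has no primary key;
// "ale" uses table s, the same data but with an IDENTITY surrogate primary key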
if (mode != "ale")
{
sqlCommand.CommandText = "IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[t]') AND type in (N'U')) DROP TABLE [dbo].[t]";
sqlCommand.ExecuteNonQuery();
string create = "CREATE TABLE [dbo].[t]("
+ " [id] [int] NOT NULL"
+ " ) ON [PRIMARY]";
sqlCommand.CommandText = create;
sqlCommand.ExecuteNonQuery();
}
else
{
sqlCommand.CommandText = "IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[s]') AND type in (N'U')) DROP TABLE [dbo].[s]";
sqlCommand.ExecuteNonQuery();
string create = "CREATE TABLE [dbo].[s]("
+ " [id] [int] IDENTITY(1,1) NOT NULL,"
+ " [field] [int] NULL,"
+ " CONSTRAINT [PK_s] PRIMARY KEY CLUSTERED"
+ " ([id] ASC) WITH (PAD_INDEX = OFF, IGNORE_DUP_KEY = OFF) ON [PRIMARY]"
+ ") ON [PRIMARY]";
sqlCommand.CommandText = create;
sqlCommand.ExecuteNonQuery();
}
sqlCommand.Dispose();
}

sqlTransaction.Commit();
Console.WriteLine("Create table");
connection.Close();
}
#endregion

#region TearDown
private static void TearDown()
{
SqlConnection connection = new SqlConnection(connectionString);
connection.Open();
SqlTransaction sqlTransaction = connection.BeginTransaction();
for (int i = 0; i < records; i++)
{
SqlCommand sqlCommand = connection.CreateCommand();
sqlCommand.Transaction = sqlTransaction;
if (mode != "ale")
{
sqlCommand.CommandText = "IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[t]') AND type in (N'U')) DROP TABLE [dbo].[t]";
sqlCommand.ExecuteNonQuery();
}
else
{
sqlCommand.CommandText = "IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[s]') AND type in (N'U')) DROP TABLE [dbo].[s]";
sqlCommand.ExecuteNonQuery();
}
sqlCommand.Dispose();
}

sqlTransaction.Commit();
Console.WriteLine("Drop table");
connection.Close();
}
#endregion

#region Producer
private static void Producer()
{

SqlConnection connection = new SqlConnection(connectionString);
int iteration = 0;
while (true)
{
connection.Open();
SqlTransaction sqlTransaction = connection.BeginTransaction(isolationLevelProducer);
for (int i = 0; i < records; i++)
{
SqlCommand sqlCommand = connection.CreateCommand();
sqlCommand.Transaction = sqlTransaction;
if (mode != "ale")
sqlCommand.CommandText = "INSERT INTO t (Id) VALUES(@p1)";
else
sqlCommand.CommandText = "INSERT INTO s (field) VALUES(@p1)";
sqlCommand.Parameters.AddWithValue("@p1", iteration);
sqlCommand.ExecuteNonQuery();
sqlCommand.Dispose();
}

sqlTransaction.Commit();
Console.WriteLine("Wrote {0} records in iteration {1}", records, iteration+1);
iteration += 1;
connection.Close();
if (iteration == iterations)
return;
}
}
#endregion

#region Consumer
private static void Consumer()
{

SqlConnection connection = new SqlConnection(connectionString);
int iteration = 0;
while (true)
{
connection.Open();
SqlTransaction sqlTransaction = connection.BeginTransaction(isolationLevelConsumer);
SqlCommand sqlCommand = connection.CreateCommand();
sqlCommand.Transaction = sqlTransaction;
if (mode != "ale")
sqlCommand.CommandText = "SELECT COUNT(*) FROM t GROUP BY id ORDER BY id ASC";
else
sqlCommand.CommandText = "SELECT COUNT(*) FROM s GROUP BY field";
SqlDataReader sqlDataReader = sqlCommand.ExecuteReader();
if (sqlDataReader.RecordsAffected != -1)
Console.WriteLine("Read: {0}", sqlDataReader.RecordsAffected);
while (sqlDataReader.Read())
{
int count = sqlDataReader.GetInt32(0);
if (mode != "ale")
Console.WriteLine("Count = {0} in {1} iteration", count, iteration+1);
if (count != records)
{
if (mode == "ale")
Console.WriteLine("Count = {0} in {1} iteration", count, iteration+1);
if (!(mode == "aye-run"))
Environment.Exit(1);
}
}

sqlDataReader.Dispose();
sqlCommand.Dispose();
sqlTransaction.Commit();
iteration += 1;
connection.Close();
if (iteration == iterations)
return;
}
}
#endregion

#region Delete
private static void Delete()
{
SqlConnection connection = new SqlConnection(connectionString);
connection.Open();
SqlTransaction sqlTransaction = connection.BeginTransaction();
for (int i = 0; i < records; i++)
{
SqlCommand sqlCommand = connection.CreateCommand();
sqlCommand.Transaction = sqlTransaction;
if (mode != "ale")
{
sqlCommand.CommandText = "DELETE FROM t";
sqlCommand.ExecuteNonQuery();
}
else
{
sqlCommand.CommandText = "DELETE FROM s";
sqlCommand.ExecuteNonQuery();
}
sqlCommand.Dispose();
}

sqlTransaction.Commit();
Console.WriteLine("Delete data from table");
connection.Close();
}
#endregion

#region Main
private static void Main(string[] args)
{
// string describing the isolation level of the producer
string ilp = string.Empty;
// string describing the isolation level of the consumer
string ilc = string.Empty;

if ((args.Length > 2) && (args.Length < 6))
{
mode = args[0];
if ((mode != "aye") &&amp;amp;amp;amp; (mode != "aye-run") && (mode != "ale"))
Environment.Exit(2);
ilp = args[1];
ilc = args[2];
}
else
Environment.Exit(3);
if (args.Length > 3)
int.TryParse(args[3], out iterations);
if (args.Length == 5)
int.TryParse(args[4], out records);

isolationLevelProducer = getIsolationLevel(ilp);
isolationLevelConsumer = getIsolationLevel(ilc);

try
{
Setup();
Delete();
Thread p = new Thread(Producer);
Thread c = new Thread(Consumer);
p.Start();
c.Start();
// wait for both worker threads to finish
p.Join();
c.Join();
}
finally
{
TearDown();
}
}
#endregion

#region Utils
private static IsolationLevel getIsolationLevel(string isolationLevel)
{
IsolationLevel il = IsolationLevel.Unspecified;

switch (isolationLevel)
{
case "ch":
il = IsolationLevel.Chaos;
break;
case "rc":
il = IsolationLevel.ReadCommitted;
break;
case "ru":
il = IsolationLevel.ReadUncommitted;
break;
case "rr":
il = IsolationLevel.RepeatableRead;
break;
case "se":
il = IsolationLevel.Serializable;
break;
case "sn":
il = IsolationLevel.Snapshot;
break;
default:
il = IsolationLevel.Unspecified;
break;
}
return il;
}
#endregion
}
}