I’m sure you’ve used a DataRow a thousand times before without noticing the quirk. I’m not talking about a bug or missing functionality. No, I’m talking about something much less important, but still very puzzling.

A DataRow is a class that contains objects, one object per column. To retrieve one of the objects, you access it by its index in the collection or by its key, like so:

row[2]    // returns the object in the third column (row is a DataRow).
row["id"] // returns the object in the column named id.

That is similar to a Hashtable or a generic Collection. In a Hashtable you access the object by the key and in the Collection by the index.

Let’s keep our focus on the similarities between a DataRow and a Hashtable. Even though they are very different in usage and nature, they are very much alike for querying as described above. In a Hashtable you can ask if it contains a certain key by calling its ContainsKey method. The same goes for all Dictionary type classes, not just the Hashtable.

That brings me to the quirk. Feel free to think I’m strange.

You cannot ask a DataRow directly if it contains a certain column. It does not have a method called ContainsKey or ContainsColumn or just Contains. In order to find out if it contains the column, you have to ask its DataTable’s column collection. You have to do this:

bool contains = row.Table.Columns.Contains("columnName"); // row is a DataRow

As you can see, you can find out if the column exists in a DataRow quite easily, but that is not my point. The DataRow is so similar to the other Dictionary and Collection type classes with regard to querying, so why doesn’t it have a method called Contains? For all I care, it could just call the DataTable’s column collection behind the scenes. Do you see my point now?
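If you want the missing method anyway, an extension method can forward the question to the table's column collection. This is only a minimal sketch: the class name DataRowExtensions and the method name ContainsColumn are my own inventions, not part of the framework, and it assumes a compiler that supports extension methods (C# 3.0 or later).

```csharp
using System;
using System.Data;

// Hypothetical helper, not part of System.Data.
public static class DataRowExtensions
{
    // Returns true when the row's table defines a column with the given name.
    public static bool ContainsColumn(this DataRow row, string columnName)
    {
        if (row == null) throw new ArgumentNullException(nameof(row));
        return row.Table.Columns.Contains(columnName);
    }
}

public static class Program
{
    public static void Main()
    {
        DataTable table = new DataTable();
        table.Columns.Add("id", typeof(int));
        DataRow row = table.NewRow();

        Console.WriteLine(row.ContainsColumn("id"));   // True
        Console.WriteLine(row.ContainsColumn("name")); // False
    }
}
```

The quirk itself remains, of course. The helper just hides the trip through row.Table.Columns.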

Is it a bug, a feature, or a missing method? No, it’s a quirk.

There are some features in the System.Data.DataTable class that a lot of developers don’t utilize. I base that statement on code samples I’ve seen on blogs and article sites during the last couple of years. Some of these features can improve performance.

Calculated columns

First of all, I’ll create a DataTable manually, even though it is more likely to be created from querying a database.

DataTable dt = new DataTable();
dt.Columns.Add("Name", typeof(string));
dt.Columns.Add("Price", typeof(double));
dt.Columns.Add("ItemsInStock", typeof(double));

Imagine that there are 100 rows in that DataTable and you now want to calculate the total price of all items currently in stock. The calculation is Price*ItemsInStock. What I see in a lot of code samples is that this column is calculated in the database by a SQL statement like this:

SELECT name, price, itemsinstock, (price*itemsinstock) AS stockprice FROM products

The overhead of letting the database do the calculation is small in this particular example, because it is a simple multiplication of two columns. It could easily be more complicated than this. The thing is that .NET performs these kinds of calculations much more efficiently than a database, and that’s why we would like .NET to do them.

The DataTable class supports on-the-fly calculated columns and they are perfect to use in the example. Just add another column to the DataTable and give it a calculation formula.

dt.Columns.Add("StockPrice", typeof(double), "Price*ItemsInStock");

The calculation expression ("Price*ItemsInStock") can also use predefined functions such as IIF, which works like an if-statement.

"IIF(ItemsInStock = 0, 100, Price*ItemsInStock)"

There are many different functions to use in the calculation expression.
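Here is a small sketch that puts the calculated column to work. The product names and numbers are made up for illustration, and the class name CalculatedColumnDemo is hypothetical. The StockPrice column is never assigned directly; the DataTable evaluates the expression for each row.

```csharp
using System;
using System.Data;

public static class CalculatedColumnDemo
{
    public static DataTable BuildTable()
    {
        DataTable dt = new DataTable();
        dt.Columns.Add("Name", typeof(string));
        dt.Columns.Add("Price", typeof(double));
        dt.Columns.Add("ItemsInStock", typeof(double));

        // Expression column: 100 when nothing is in stock, otherwise Price * ItemsInStock.
        dt.Columns.Add("StockPrice", typeof(double),
            "IIF(ItemsInStock = 0, 100, Price*ItemsInStock)");

        // Only the three regular columns get values (sample data, invented).
        dt.Rows.Add("Widget", 2.5, 4.0);
        dt.Rows.Add("Gadget", 10.0, 0.0);
        return dt;
    }

    public static void Main()
    {
        foreach (DataRow row in BuildTable().Rows)
        {
            Console.WriteLine("{0}: {1}", row["Name"], row["StockPrice"]);
        }
    }
}
```

With this sample data the Widget row evaluates to 2.5 * 4 = 10, and the Gadget row falls into the IIF branch and gets 100.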

Auto increment

Let’s say you want to bind the DataTable to a DataGrid in an ASP.NET page and that you want a column to display the row number. This can be done by adding a column to the DataTable that has the AutoIncrement property enabled.

DataColumn col = new DataColumn("#", typeof(int));
col.AutoIncrement = true;
col.AutoIncrementSeed = 1;
dt.Columns.Add(col);

Now you have a column named “#” that contains the row number.
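As a quick sketch, this is how the numbering behaves when rows are added after the column exists. The class name AutoIncrementDemo, the AddItem helper and the item names are hypothetical and only there to keep the example self-contained.

```csharp
using System;
using System.Data;

public static class AutoIncrementDemo
{
    public static DataTable BuildTable()
    {
        DataTable dt = new DataTable();
        dt.Columns.Add("Name", typeof(string));

        DataColumn col = new DataColumn("#", typeof(int));
        col.AutoIncrement = true;
        col.AutoIncrementSeed = 1; // first number handed out
        col.AutoIncrementStep = 1; // the default step, shown for clarity
        dt.Columns.Add(col);

        AddItem(dt, "Widget");
        AddItem(dt, "Gadget");
        AddItem(dt, "Gizmo");
        return dt;
    }

    // Hypothetical helper: NewRow assigns the next "#" value, we only set the name.
    private static void AddItem(DataTable dt, string name)
    {
        DataRow row = dt.NewRow();
        row["Name"] = name;
        dt.Rows.Add(row);
    }

    public static void Main()
    {
        foreach (DataRow row in BuildTable().Rows)
        {
            Console.WriteLine("{0} {1}", row["#"], row["Name"]);
        }
    }
}
```

The three rows get the numbers 1, 2 and 3 in the order they were created.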

Querying the DataTable

You can query a DataTable in different ways in order to find the row you need. If you want all the rows in the DataTable that match a search expression, then you would use the Select method.

DataRow[] rows = dt.Select("Price > 159");

The Select method returns a DataRow array you can loop through like you normally would loop through all the rows in the DataTable.

foreach (DataRow row in rows)
{
   DoSomething();
}

If you just want a single row based on the DataTable’s primary key, then you have to let the DataTable know which of the columns is the primary key.

dt.PrimaryKey = new DataColumn[] { dt.Columns["#"]};

When you have defined the DataTable’s primary key, you can now query directly for that key and get the whole row returned by using the Find method.

DataRow oneRow = dt.Rows.Find("19");

This method is faster than the Select method. If there is no row with the primary key value of “19”, the Find method returns null. So, before you use the returned DataRow, you probably want to check that the row exists first.

if (oneRow != null)
{
   DoSomething();
}
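To see Select and Find side by side, here is a self-contained sketch. The class name QueryDemo, the AddProduct helper and the sample products are all made up. I pass the key to Find as an int here because the “#” column is an int column.

```csharp
using System;
using System.Data;

public static class QueryDemo
{
    public static DataTable BuildTable()
    {
        DataTable dt = new DataTable();

        DataColumn id = new DataColumn("#", typeof(int));
        id.AutoIncrement = true;
        id.AutoIncrementSeed = 1;
        dt.Columns.Add(id);
        dt.Columns.Add("Name", typeof(string));
        dt.Columns.Add("Price", typeof(double));

        // Find only works once the table knows its primary key.
        dt.PrimaryKey = new DataColumn[] { dt.Columns["#"] };

        AddProduct(dt, "Widget", 99.0);  // # = 1
        AddProduct(dt, "Gadget", 160.0); // # = 2
        AddProduct(dt, "Gizmo", 250.0);  // # = 3
        return dt;
    }

    // Hypothetical helper; the "#" value is assigned by the auto-increment column.
    private static void AddProduct(DataTable dt, string name, double price)
    {
        DataRow row = dt.NewRow();
        row["Name"] = name;
        row["Price"] = price;
        dt.Rows.Add(row);
    }

    public static void Main()
    {
        DataTable dt = BuildTable();

        // Select returns every row that matches the filter expression.
        DataRow[] expensive = dt.Select("Price > 159");
        Console.WriteLine(expensive.Length); // 2

        // Find returns the single row with that primary key, or null.
        DataRow oneRow = dt.Rows.Find(2);
        if (oneRow != null)
        {
            Console.WriteLine(oneRow["Name"]); // Gadget
        }

        Console.WriteLine(dt.Rows.Find(99) == null); // True
    }
}
```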

Column totals

You decide to add totals to the footer row of the DataGrid and therefore need to sum the numeric columns. You can do that very easily with the Compute method.

object total = dt.Compute("Sum(Price)", null);

Or apply a filter:

object filteredTotal = dt.Compute("Sum(Price)", "Price > 40");
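Compute returns a plain object, so the result has to be converted before you can use it as a number. The sketch below shows one way to do that. The class name ComputeDemo, the SumPrice helper and the prices are invented for the example, and treating an empty result as zero is my own choice for a footer total, not something the framework does for you.

```csharp
using System;
using System.Data;

public static class ComputeDemo
{
    public static DataTable BuildTable()
    {
        DataTable dt = new DataTable();
        dt.Columns.Add("Name", typeof(string));
        dt.Columns.Add("Price", typeof(double));
        dt.Rows.Add("Widget", 30.0);
        dt.Rows.Add("Gadget", 50.0);
        dt.Rows.Add("Gizmo", 200.0);
        return dt;
    }

    // Hypothetical helper: sums the Price column, optionally restricted by a filter.
    public static double SumPrice(DataTable dt, string filter)
    {
        object result = dt.Compute("Sum(Price)", filter);

        // When no rows match the filter the aggregate has no value; report zero.
        if (result == null || result == DBNull.Value)
        {
            return 0.0;
        }
        return Convert.ToDouble(result);
    }

    public static void Main()
    {
        DataTable dt = BuildTable();
        Console.WriteLine(SumPrice(dt, null));         // 280
        Console.WriteLine(SumPrice(dt, "Price > 40")); // 250
    }
}
```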

The DataTable class is very powerful and can improve performance by moving calculations to .NET instead of doing them in the database. The different ways to query the rows are also impressive and flexible, and that makes the DataTable a serious in-memory database.