Skip to content

Linq

Language Integrated Query


Linq allows querying data that are available in various different formats using a unified querying model.

We might query data in one of the following representations:

  • Relational data using SQL
  • In-Memory Data as object graphs
  • XML documents

Linq uses a declarative programming paradigm. You define what you want, not how to get there.


A Linq Query consists of 3 steps:

  1. Define a data source
  2. Create a query
  3. Execute the query

Linq heavily build on the IEnumerable and IQueryable interfaces. Whenever a data source implements one of these interfaces, Linq can be used to query the data behind it.

If a datasource doesn’t offer one of these interfaces, the data must be loaded into memory (e.g. into a List, Array, or similar that implements IEnumerable) to be able to query it.


// Step 1 - Define data source
List<Product> products = new List<Product> {
new Product("Coke", categoryId: 0),
new Product("Tea", categoryId: 0),
new Product("Apple", categoryId: 1)
}

Linq offers 2 different ways of creating a query

  • Query Expressions
  • Method Syntax

Query expressions look similar to SQL statements. These expressions are built into the C# language and converted by the compiler to method calls.

var query =
from product in products
join category in categories on
product.CategoryId equals category.Id
select new {
CategoryName = category.CategoryName,
Name = product.Name
}

LinqQueryExpression.excalidraw.svg


The same query can be defined using regular C# method calls and lambda expressions.

var query = products
.Join(
categories,
p => p.CategoryId,
c => c.Id,
(p, c) => new
{
CategoryName = c.CategoryName,
Name = p.Name
});

Depending on the underlying implementation of the data source, executing the query might mean completely different things.

E.g. when querying data stored in memory (e.g. List<Product>) it will use C# methods to create the result. When talking to a database using an IQueryable object, an SQL query will be generated instead and executed on the database (that’s called Linq to SQL btw.).


You can execute a query by enumerating the elements of the IEnumerable interface, for example using a foreach loop, or by calling ToList:

// Execute a query using a loop
foreach (var product in query) {
Console.WriteLine(
$"{product.Name} {product.CategoryName}");
}
// Execute a query using ToList
var result = query.ToList();

When defining a Linq query, no operation on the underlying data source will be triggered. We only specify what will happen, once we actually execute the query.

Only by enumerating the query, calling ToList / ToArray, or similar we actually perform the operations specified in the query.


SelectionProjection.excalidraw.svg

When using a where operation, what we do is actually a selection. We choose which of the available items should be included in the result.

When using a select operation, what we do is actually a projection. We project one representation of data on another representation. For example instead of querying projects, we only query the name for each project.


LinqSelect1.svg


LinqSelect2.svg


For projection you can also use anonymous classes.

IQueryable<Customer> customers = db.GetTable<Customers>();
var namePhoneQuery =
from cust in customers
where cust.City == "London"
select new
{
Name = cust.Name,
Phone = cust.Phone
};
foreach (var item in namePhoneQuery) {
Console.WriteLine($"{item.Name}: {item.Phone}");
}

We can use the where clause in Linq to restrict the data we want to return. This is called selection, or filtering.


IList<Student> studentList = new List<Student>() {
new Student
{
StudentId = 1,
StudentName = "John",
Age = 13
},
new Student
{
StudentId = 2,
StudentName = "Detlef",
Age = 21
}
};

// Query all students whose names have at least 5 characters
var result = from s in studentList
where s.Length >= 5
select s;
// Find all students aged between 13 and 19
var result = from s in studentList
where s.Age <= 19 && s.Age >= 13
select s;

Func<Student, bool> isTeenager =
(Student s) => s.Age <= 19 && s.Age >= 13;
// Find all students aged between 13 and 19
var result = from s in studentList
where isTeenager(s)
select s;

Use from to define the data source of your query.

var result = from s in studentList
select s.StudentName.Substring(0, 1);

Use OrderBy to sort the elements of your query.

string[] words = [
"the", "quick", "brows", "fox", "jumps"
];
IEnumerable<string> query = from word in words
orderby word.Length
select word;

You can also define the order (ascending or descending). If not specified, ascending sorting will be used.

string[] words = [
"the", "quick", "brows", "fox", "jumps"
];
IEnumerable<string> query = from word in words
orderby word.Length descending
select word;

You can use different forms of join operations with Linq.

from ... in <outerSequence>
join ... in <innerSequence>
on <outerKey> equals <innerKey>
select ...

// Step 1: Define a data source
List<Product> products = new List<Product> {
new Product("Apple", 1),
new Product("Coke", 0),
new Product("Tea", 0)
};
List<Category> categories = new List<Category> {
new Category(0, "Beverage"),
new Category(1, "Fruit"),
new Category(2, "Vegetable")
};

// For all products, find the product name and category name
var result =
from p in products
join c in categories
on p.CategoryId equals c.Id
select new {
CategoryName = c.CategoryName,
Name = p.Name
}

// Define a data source
class Person {
public string FirstName { get; set; }
public string LastName { get; set; }
}
class Pet {
public string Name { get; set; }
public Person Owner { get; set; }
}

Linq - Inner Join - Object Association (2)

Section titled “Linq - Inner Join - Object Association (2)”
// For all pets, find the pet's name and the owner's name
var query =
from p in people
join pet in pets
on p equals pet.Owner
select new {
OwnerName = p.FirstName,
PetName = pet.Name
}

// Define a data source
List<Employee> employees = new List<Employee> {
new Employee(
"Terry", "Adams", 522459
)
};
List<Student> students = new List<Student> {
new Student(
"Vernette", "Price", 9562
)
};

// Define a query
var result =
from e in employees
join s in students
on new {
e.FirstName, e.LastName
}
equals new {
s.FirstName, s.LastName
}
select e.FirstName + " " + e.LastName;

// Define a data source
class Person {
public string FirstName { get; set; }
public string LastName { get; set; }
}
class Pet {
public string Name { get; set; }
public Person Owner { get; set; }
}
class Cat : Pet { }
class Dog : Pet { }

// Define a query
var query =
from p in people
join c in cats
on p equals c.Owner
join d in dogs
on new {
Owner = p,
Letter = c.Name.Substr(0, 1)
}
equals new {
d.Owner, Letter = d.Name.Substr(0, 1)
}
select new {
CatName = c.Name,
DogName = d.Name
}

Partitioning operators are used to partition the result of a query.

LinqPartitioning.excalidraw.svg


  • Operators for Partitioning:
    • Skip
    • SkipWhile
    • Take
    • TakeWhile

Skip ignores the first n elements.

IEnumerable<string> query = select ... ;
query.Skip(5);

SkipWhile ignores elements at the beginning of the data source as long as they fulfil a certain condition.

IEnumerable<string> query = {59, 82, 70, 56, 92};
query.SkipWhile(grade => grade >= 80);

Take limits the result to a certain amount of elements.

IEnumerable<string> query = {59, 82, 70, 56, 92};
query.Take(3);

Aggregation Operators can be used to find aggregates of multiple data entries.

LinqAggregation.excalidraw.svg


  • Aggregation Operators:
    • Min, Max, Average
    • Count, Sum
    • Aggregate

// Compute the sum of the elements of an array
var numbers = new int[] { 4, 56, 2, 5, 43, 5 };
int sum = (from n in numbers).Sum();
// Compute the sum of the elements of an array
var numbers = new int[] { 4, 56, 2, 5, 43, 5 };
int sum = numbers.Sum();

// Compute the sum of the elements of an array
var numbers = new int[] { 4, 5, 3, 9 }j
int sum = numbers.Aggregate(
(result, item) => result + item
);

LinqAggregation2.excalidraw.svg


Use tho group clause to group data.

LinqGroup.excalidraw.svg


List<int> numbers = new List<int> {
35, 44, 200, 84, 3987, 4, 199, 446
};
// Group the list of numbers into even and odd numbers
IEnumerable<IGrouping<int, int>> query =
from n in numbers group n by n % 2;
List<int> numbers = new List<int> {
35, 44, 200, 84, 3987, 4, 199, 446
};
// Group the list of numbers into even and odd numbers
IEnumerable<IGrouping<int, int>> query =
numbers.GroupBy(n => n % 2);

foreach (var group in query) {
Console.WriteLine(
group.Key == 0
? "Even numbers"
: "Odd numbers")
foreach (var i in group) {
Console.WriteLine(i);
}
}

List<Student> students = new List<Student> {
new {
"Terry", "Adams", 120, GradeLevel.SecondYear
},
new {
"Fadi", "Fakhouri", 120, GradeLevel.ThirdYear
}
};

Group elements by a single property.

var lastNames =
from s in students
group s by s.LastName
into newGroup // grouped data
orderby newGroup.Key
select newGroup;
var lastNames = students
.GroupBy(s => s.LastName)
.OrderBy(g => g.Key);

Group elements by a derived value.

var groupResults =
from s in students
group s by s.LastName[0]
var groupResults = students
.GroupBy(s => s.LastName[0]);
foreach (var studentGroup in groupResults) {
foreach (var s in studentGroup) {
Console.WriteLine(
$"{s.LastName} {s.FirstName}"
);
}
}

var query =
from s in students
group s by new {
FirstLetter = s.LastName[0],
Score = s.ExamScore > 85
}
into studentGroup // grouped data
orderby studentGroup.Key.FirstLetter
select studentGroup;
var query = students
.GroupBy(s => new {
FirstLetter = s.LastName[0],
Score = s.ExamScore > 85
})
.OrderBy(g => g.Key.FirstLetter);

var query =
from s in students
group s by new {
FirstLetter = s.LastName[0],
Score = s.ExamScore > 85
}
into studentGroup
where studentGroup.Count() > 3
select studentGroup
var query = students
.GroupBy(s => new {
FirstLetter = s.LastName[0],
Score = s.ExamScore > 85
})
.Where(g => g.Count() > 3)
.OrderBy(g => g.Key.FirstLetter);

var players = new List<Player> {
new Player(name: "Alex", team: "A", score: 10),
new Player(name: "Anna", team: "A", score: 20),
new Player(name: "Luke", team: "L", score: 60),
new Player(name: "Lucy", team: "L", score: 40)
};

var scores =
from p in players
group p by p.Team into playerGroup
select new {
Team = playerGroup.Key,
TotalScore = playerGroup.Sum(
x => x.Score
)
};
var scores = players
.GroupBy(p => p.Team)
.Select(g => new {
Team = playerGroup.Key,
TotalScore = g.Sum(x => x.Score)
});

To further specify the data source of a query you can use set operations such as:

  • Distinct
  • Except
  • Intersect
  • Union

The Distinct method removes duplicates from the data source.

LinqDistinct.excalidraw.svg


string[] planets = {
"Mercury", "Venus", "Venus", "Earth", "Mars", "Earth"
};
var query =
from planet in planets.Distinct()
select planet;
var query = planets.Distinct();

LinqExcept.excalidraw.svg


string[] planets1 = {
"Mercury", "Venus", "Earth", "Jupiter"
};
string[] planets2 = {
"Mercury", "Earth", "Mars", "Jupiter"
};
var query =
from planet in planets1.Except(planets2)
select planet;
var query = planets1.Except(planets2);

Intersect finds elements that appear in both data sources.

LinqIntersect.excalidraw.svg


string[] planets1 = {
"Mercury", "Venus", "Earth", "Jupiter"
};
string[] planets2 = {
"Mercury", "Earth", "Mars", "Jupiter"
};
var query =
from planet in planets1.Intersect(planets2)
select planet;
vat query = planets1.Intersect(planets2);

Union combines two data sources to one, including all elements from both collections.

LinqUnion.excalidraw.svg

Union will also remove duplicates as long as the items can be compared using the overriden Equals method or an comparer object that implements the IEquatable<T> interface.


string[] planets1 = {
"Mercury", "Venus", "Earth", "Jupiter"
};
string[] planets2 = {
"Mercury", "Earth", "Mars", "Jupiter"
};
var query =
from planet in planets1.Union(planets2)
select planet;
var query = planets1.Union(planets2);

Quantifiers classify a set of values. The return value of a quantifier is a boolean value.

  • Operators:
    • All
    • Any
    • Contains

The All quantifier checks, whether each and everyone of the elements of a data source fulfils a condition.

List<Market> markets = new List<Market> {
new Market {
Name = "Emiliy",
Items = new string[]{"kiwi", "cherry", "banana"}
},
new Market {
Name = "Kim",
Items = new string[]{"melon", "mango", "olive"}
}
};

// Which market only offers fruits
// of which the name is at least 5 characters long
var query =
from m in markets
where m.Items.All(item => item.Length >= 5)
select m.Name;
var query = markets
.Where(m => m.Items.All(item => item.Length >= 5));

The Any quantifier checks, whether at least one element fulfils a condition.

// Which market sells at least one kind of fruit
// that starts with the letter o.
var query =
from m in markets
where m.Items.Any(item => item.StartsWith("o"))
select m.Name;
var query = markets
.Where(m => m.Items.Any(item => item.StartsWith("o")));

The Contains quantifier checks whether a certain element exists in the data source.

// Which market sells at least one fruit that starts with o
var query =
from m in markets
where m.Items.Contains("kiwi")
select m.Name;
var query f markets
.Where(m => m.Items.Contains("kiwi"));

Subqueries are queries that are embedded into other queries.

var players = new List<Player> {
new Player(name: "Alex", team: "A", score: 10),
new Player(name: "Anna", team: "A", score: 20),
new Player(name: "Luke", team: "L", score: 60),
new Player(name: "Lucy", team: "L", score: 40)
};

// Which players got the most points?
var query =
from p in players
where p.Points ==
(from p1.Points in players).Max()
select p;
var query = players
.Where(p => p.Points == players.Max(p1 => p1.Points));