EntityFramework Tips PDF

1. The document provides tips for using Entity Framework Core more efficiently, including using AsNoTracking to avoid change tracking overhead, using SQLite for in-memory testing, and loading related data asynchronously using LoadAsync to avoid timeouts.
2. It recommends explicitly defining string length when modeling to avoid NVARCHAR(MAX) and performance issues, and to split queries using SplitQuery to avoid Cartesian explosions from joins.
3. The tips aim to improve performance and memory usage, especially for large datasets, by leveraging features like asynchronous loading, avoiding unnecessary tracking, and splitting complex queries.

Uploaded by

rjsinha

Intro

This short PDF collects some tips and tricks for Entity Framework Core. The examples were taken from "Tips and tricks". The following tips are covered:
Don't track readonly entities
In-Memory database with SQLite
LoadAsync - Split queries into smaller chunks
Define maximum length for strings
Split multiple queries - SplitQuery
Don't track readonly entities
When loading models from the database, Entity Framework Core tracks them by default: the change tracker keeps a reference to every returned entity together with a snapshot of its original values for change detection. If you load objects that will never get updated, this is unnecessary overhead, both in terms of performance and in allocations, as the tracking data has its own memory footprint.
❌ Bad Entities are tracked for change detection even though they are only meant for reading.
return await blogPosts.Where(b => b.IsPublished)
    .Include(b => b.Tags)
    .ToListAsync();

✅ Good Explicitly say that the entities are not tracked by EF.
return await blogPosts.Where(b => b.IsPublished)
    .Include(b => b.Tags)
    .AsNoTracking()
    .ToListAsync();

Benchmark
A detailed setup and the benchmark referred to below can be found in the official documentation: https://docs.microsoft.com/en-us/ef/core/performance/efficient-querying#tracking-no-tracking-and-identity-resolution
| Method       | Blogs | PostsPer |       Mean |    Error |   StdDev |     Median | Ratio |   Gen 0 |   Gen 1 | Allocated |
|------------- |------ |--------- |-----------:|---------:|---------:|-----------:|------:|--------:|--------:|----------:|
| AsTracking   | 10    | 20       | 1,414.7 us | 27.20 us | 45.44 us | 1,405.5 us |  1.00 | 60.5469 | 13.6719 | 380.11 KB |
| AsNoTracking | 10    | 20       |   993.3 us | 24.04 us | 65.40 us |   966.2 us |  0.71 | 37.1094 |  6.8359 | 232.89 KB |
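If most queries of a context are read-only, calling AsNoTracking on each of them gets repetitive. A minimal sketch of switching the default for a whole context could look as follows. It assumes the Microsoft.EntityFrameworkCore.Sqlite package is installed; ReadOnlyBlogContext and NoTrackingDefaultDemo are hypothetical names used only for illustration.

```csharp
using Microsoft.EntityFrameworkCore;

// Hypothetical context used only for this illustration.
public class ReadOnlyBlogContext : DbContext
{
    public ReadOnlyBlogContext(DbContextOptions<ReadOnlyBlogContext> options)
        : base(options) { }
}

public static class NoTrackingDefaultDemo
{
    public static QueryTrackingBehavior ConfiguredBehavior()
    {
        var options = new DbContextOptionsBuilder<ReadOnlyBlogContext>()
            .UseSqlite("DataSource=:memory:")
            // Every query on this context is non-tracking unless it opts in.
            .UseQueryTrackingBehavior(QueryTrackingBehavior.NoTracking)
            .Options;

        using var context = new ReadOnlyBlogContext(options);

        // The change tracker reports the default that queries will use.
        return context.ChangeTracker.QueryTrackingBehavior;
    }
}
```

A single query that does need change detection can still opt back in with AsTracking().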

In-Memory database with SQLite


SQLite can be used as an in-memory database for your code by using the special :memory: data source. This is a big advantage in testing. The database is transient, which means that as soon as the connection is closed the memory is freed. One downside is that the in-memory database is not thread-safe by default.
The advantage over the in-memory provider from the Microsoft.EntityFrameworkCore.InMemory package is that the SQLite version behaves much more like a real relational database. Microsoft also discourages the use of the InMemory provider.
var connection = new SqliteConnection("DataSource=:memory:");
connection.Open();

services.AddDbContext<MyDbContext>(options =>
{
    options.UseSqlite(connection);
});

To make it work with multiple connections at a time, we can utilize the cache=shared identifier for the data source. More information can be
found on the official website.
var connection = new SqliteConnection("DataSource=myshareddb;mode=memory;cache=shared");
connection.Open();

services.AddDbContext<MyDbContext>(options =>
{
    options.UseSqlite(connection);
});

The database gets cleaned up when there is no active connection anymore.


💡 Info: You have to install the Microsoft.EntityFrameworkCore.Sqlite package to use the UseSqlite method.
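Because the in-memory database starts out empty, a test usually has to create the schema before using it. The following is a minimal sketch of that flow without dependency injection. Note, NotesContext and InMemorySqliteDemo are hypothetical names, and the Microsoft.EntityFrameworkCore.Sqlite package is assumed to be installed.

```csharp
using System.Linq;
using Microsoft.Data.Sqlite;
using Microsoft.EntityFrameworkCore;

// Hypothetical entity and context used only for this illustration.
public class Note
{
    public int Id { get; set; }
    public string Text { get; set; } = "";
}

public class NotesContext : DbContext
{
    public NotesContext(DbContextOptions<NotesContext> options) : base(options) { }

    public DbSet<Note> Notes => Set<Note>();
}

public static class InMemorySqliteDemo
{
    public static int Run()
    {
        // The database lives only as long as this connection stays open.
        using var connection = new SqliteConnection("DataSource=:memory:");
        connection.Open();

        var options = new DbContextOptionsBuilder<NotesContext>()
            .UseSqlite(connection)
            .Options;

        using (var setup = new NotesContext(options))
        {
            // Create the tables from the model, then store one row.
            setup.Database.EnsureCreated();
            setup.Notes.Add(new Note { Text = "hello" });
            setup.SaveChanges();
        }

        // A second context on the same open connection sees the same data.
        using var verify = new NotesContext(options);
        return verify.Notes.Count();
    }
}
```

Both contexts share the one open connection, so the data written by the first is visible to the second. Once Run returns, the connection is disposed and the database is gone.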
LoadAsync - Split queries into smaller chunks
LoadAsync in combination with an EF Core DbContext can load related entities. That is useful when dealing with big data sets, Cartesian explosion (wanted or unwanted), or lots of joins or unions. With big data sets, a single large query can easily reach the timeout. Because LoadAsync splits the work into smaller queries, each of them is less likely to hit the timeout on its own.
This code:
return await _context.Books
    .Include(b => b.BookCategories)
    .ThenInclude(c => c.Category)
    .Include(b => b.Author.Biography)
    .ToListAsync();

Will be roughly translated to the following SQL statement:


SELECT [b].[Id], [b].[AuthorId], [b].[Title], [a].[Id], [a0].[Id], [t].[BookId], [t].[CategoryId], [t].[Id],
    [t].[CategoryName], [a].[FirstName], [a].[LastName], [a0].[AuthorId], [a0].[Biography], [a0].[DateOfBirth],
    [a0].[Nationality], [a0].[PlaceOfBirth]
FROM [Books] AS [b]
INNER JOIN [Authors] AS [a] ON [b].[AuthorId] = [a].[Id]
LEFT JOIN [AuthorBiographies] AS [a0] ON [a].[Id] = [a0].[AuthorId]
LEFT JOIN (
    SELECT [b0].[BookId], [b0].[CategoryId], [c].[Id], [c].[CategoryName]
    FROM [BookCategories] AS [b0]
    INNER JOIN [Categories] AS [c] ON [b0].[CategoryId] = [c].[Id]
) AS [t] ON [b].[Id] = [t].[BookId]
ORDER BY [b].[Id], [a].[Id], [a0].[Id], [t].[BookId], [t].[CategoryId]

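With LoadAsync:
var query = _context.Books;

// Each LoadAsync call runs its own, smaller query. The change tracker
// wires the loaded entities together, so the queries must be tracked.
await query.Include(x => x.BookCategories)
    .ThenInclude(x => x.Category).LoadAsync();
await query.Include(x => x.Author).LoadAsync();
await query.Include(x => x.Author.Biography).LoadAsync();

return await query.ToListAsync();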

Which will be translated into:


SELECT [b].[Id], [b].[AuthorId], [b].[Title], [t].[BookId], [t].[CategoryId], [t].[Id], [t].[CategoryName]
FROM [Books] AS [b]
LEFT JOIN (
    SELECT [b0].[BookId], [b0].[CategoryId], [c].[Id], [c].[CategoryName]
    FROM [BookCategories] AS [b0]
    INNER JOIN [Categories] AS [c] ON [b0].[CategoryId] = [c].[Id]
) AS [t] ON [b].[Id] = [t].[BookId]
ORDER BY [b].[Id], [t].[BookId], [t].[CategoryId]

SELECT [b].[Id], [b].[AuthorId], [b].[Title], [a].[Id], [a].[FirstName], [a].[LastName]
FROM [Books] AS [b]
INNER JOIN [Authors] AS [a] ON [b].[AuthorId] = [a].[Id]

SELECT [b].[Id], [b].[AuthorId], [b].[Title], [a].[Id], [a].[FirstName], [a].[LastName],
    [a0].[Id], [a0].[AuthorId], [a0].[Biography], [a0].[DateOfBirth], [a0].[Nationality], [a0].[PlaceOfBirth]
FROM [Books] AS [b]
INNER JOIN [Authors] AS [a] ON [b].[AuthorId] = [a].[Id]
LEFT JOIN [AuthorBiographies] AS [a0] ON [a].[Id] = [a0].[AuthorId]

SELECT [b].[Id], [b].[AuthorId], [b].[Title]
FROM [Books] AS [b]

Define maximum length for strings


When creating a SQL database with the code-first approach, it is important to tell EF Core how long a string can be; otherwise the column is mapped to NVARCHAR(MAX) by default. This has performance implications and causes other problems, such as not being able to create an index on that column. A rogue application could also flood the database with oversized values.
Having this model:
public class BlogPost
{
    public int Id { get; private set; }
    public string Title { get; private set; }
    public string Content { get; private set; }
}

❌ Bad Not defining the maximum length of a string will lead to NVARCHAR(MAX).
public class BlogPostConfiguration : IEntityTypeConfiguration<BlogPost>
{
    public void Configure(EntityTypeBuilder<BlogPost> builder)
    {
        builder.HasKey(c => c.Id);
        builder.Property(c => c.Id).ValueGeneratedOnAdd();
    }
}

Will lead to generation of this SQL table:


CREATE TABLE BlogPosts
(
    [Id] [int] NOT NULL,
    [Title] [NVARCHAR](MAX) NULL,
    [Content] [NVARCHAR](MAX) NULL
)

✅ Good Defining the maximum length is also reflected in the database table.
public class BlogPostConfiguration : IEntityTypeConfiguration<BlogPost>
{
    public void Configure(EntityTypeBuilder<BlogPost> builder)
    {
        builder.HasKey(c => c.Id);
        builder.Property(c => c.Id).ValueGeneratedOnAdd();

        // Set title max length explicitly to 256
        builder.Property(c => c.Title).HasMaxLength(256);
    }
}

Will lead to generation of this SQL table:


CREATE TABLE BlogPosts
(
    [Id] [int] NOT NULL,
    [Title] [NVARCHAR](256) NULL, -- Now it only can hold 256 characters
    [Content] [NVARCHAR](MAX) NULL
)
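The same limit can also be declared on the model itself with a data annotation, which EF Core picks up when building the model. The sketch below is an alternative to the fluent configuration above, not a replacement recommended by the original text. AnnotatedBlogPost and MaxLengthDemo are hypothetical names, and the helper only reads the attribute back via reflection to show what was declared.

```csharp
using System.ComponentModel.DataAnnotations;
using System.Reflection;

// Hypothetical variant of BlogPost that declares the limit on the property.
public class AnnotatedBlogPost
{
    public int Id { get; private set; }

    // Equivalent in intent to builder.Property(c => c.Title).HasMaxLength(256).
    [MaxLength(256)]
    public string Title { get; private set; } = "";

    public string Content { get; private set; } = "";
}

public static class MaxLengthDemo
{
    // Reads the declared maximum length of Title, or null if none is set.
    public static int? TitleMaxLength()
    {
        var property = typeof(AnnotatedBlogPost).GetProperty(nameof(AnnotatedBlogPost.Title));
        var attribute = property?.GetCustomAttribute<MaxLengthAttribute>();
        return attribute?.Length;
    }
}
```

Whether to use the attribute or the fluent API is mostly a design choice: the attribute keeps the limit next to the property, while the configuration class keeps the model free of persistence concerns.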

Split multiple queries - SplitQuery


The basic idea is to avoid a "Cartesian explosion". A Cartesian explosion happens when a JOIN is performed on a one-to-many relationship: the rows of the "one" side are replicated N times (N = number of rows on the "many" side).
With SplitQuery, instead of having 1 query you will now have 2 queries: first the "one" side is loaded, and the "many" side is loaded in a separate query. While SplitQuery can bring improvements, it also has some major drawbacks.
1. You go to your database twice instead of once.
2. From the database's point of view these are two separate queries, so there is no guarantee of data consistency. Race conditions could interfere with your result set.
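To get a feeling for the numbers, the effect can be simulated without a database. The sketch below uses LINQ to Objects to join two sibling collections (posts and tags) onto blogs, which is roughly what a single query with two collection includes has to return. It then compares that with the rows read when the root and each collection are loaded separately. CartesianExplosionDemo and its parameters are hypothetical and only illustrate the arithmetic.

```csharp
using System;
using System.Linq;

public static class CartesianExplosionDemo
{
    // Rows returned when posts and tags are both joined onto blogs in one query:
    // every post of a blog is paired with every tag of that blog.
    public static int SingleQueryRows(int blogs, int postsPerBlog, int tagsPerBlog)
    {
        var blogIds = Enumerable.Range(1, blogs).ToList();
        var posts = blogIds
            .SelectMany(b => Enumerable.Range(1, postsPerBlog).Select(p => (BlogId: b, PostId: p)))
            .ToList();
        var tags = blogIds
            .SelectMany(b => Enumerable.Range(1, tagsPerBlog).Select(t => (BlogId: b, TagId: t)))
            .ToList();

        var rows =
            from b in blogIds
            join p in posts on b equals p.BlogId
            join t in tags on b equals t.BlogId
            select (b, p.PostId, t.TagId);

        return rows.Count();
    }

    // Rows read when the root and each collection are loaded by separate queries.
    public static int SplitQueryRows(int blogs, int postsPerBlog, int tagsPerBlog)
        => blogs + blogs * postsPerBlog + blogs * tagsPerBlog;
}
```

With 10 blogs, 20 posts and 5 tags per blog, the single query yields 10 * 20 * 5 = 1000 rows, while the split version reads 10 + 200 + 50 = 260 rows in total.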
❌ Bad Every include is resolved by a LEFT JOIN leading to duplicated entries.
var blogPosts = await DbContext.Blog
    .Include(b => b.Posts)
    .Include(b => b.Tags)
    .ToListAsync();

✅ Good Get all related children by a separate query which gets resolved by an INNER JOIN.
var blogPosts = await DbContext.Blog
    .Include(b => b.Posts)
    .Include(b => b.Tags)
    .AsSplitQuery()
    .ToListAsync();

💡 Info: There are databases that support multiple result sets in one query (for example SQL Server with Multiple Active Result Sets). Here the performance is even better. For more information, check out the official Microsoft page about SplitQuery.
