Coding in London: Notes from DDD8a

The DeveloperDeveloperDeveloper event at the Microsoft offices in Reading was split into two tracks of simultaneous sessions.

Notes from some of the sessions:

It’s Time to Look at Entity Framework

Speaker: Julie Lerman (twitter @julielerman) thedatafarm.com

The Entity Framework was initially designed to reverse engineer existing databases but you can now also create them from scratch.

The designer generates DDL. It doesn’t actually create the database; it produces a script that you then hand to a DBA. (Not sure how you specify paths for the database and log files?)

Support for POCO classes: your classes don’t have to inherit from Entity Framework classes, so you can keep your model totally separate from EF and still work with the designer. You do this through T4 templates, some of which you can download from the Visual Studio Gallery.

If you tweak the T4 template you can abstract the EF layer (ObjectSet, etc.) away completely. In other words, your code can interact with EF only through interfaces, which makes unit testing possible.
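To illustrate why coding against an interface helps testing, here is a minimal sketch in Python rather than C#. It is not EF code: `Customer`, `CustomerSource`, `InMemoryCustomers` and `customers_in` are hypothetical names made up for this example. The point is only that the business logic sees an interface, so a test can swap in an in-memory fake instead of a database.

```python
from dataclasses import dataclass
from typing import Iterable, List, Protocol


@dataclass
class Customer:
    # Plain class with no persistence base class, in the spirit of a POCO.
    name: str
    country: str


class CustomerSource(Protocol):
    # Hypothetical interface standing in for whatever a tweaked template
    # might expose instead of a concrete ObjectSet.
    def all(self) -> Iterable[Customer]: ...


class InMemoryCustomers:
    # Test double: satisfies CustomerSource without touching a database.
    def __init__(self, rows: List[Customer]) -> None:
        self._rows = list(rows)

    def all(self) -> Iterable[Customer]:
        return iter(self._rows)


def customers_in(source: CustomerSource, country: str) -> List[str]:
    # Business logic depends only on the interface, not on the ORM.
    return sorted(c.name for c in source.all() if c.country == country)


fake = InMemoryCustomers([
    Customer("Ada", "UK"),
    Customer("Linus", "FI"),
    Customer("Alan", "UK"),
])
print(customers_in(fake, "UK"))  # ['Ada', 'Alan']
```

In production the same `customers_in` function would be handed an object backed by the real data layer.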

TODO:

Check out T4 and Code First

Packaging in the .NET World

Speaker: Seb Lambla (twitter @serialseb)

Seb demoed OpenWrap (www.openwrap.org; the site is going live in a few days), a packaging tool that you use to inject dependencies into your build. The dependencies must be available as “wraps” for this to be possible. You inject the dependencies with a simple command-line tool.

The OpenWrap commands can also be used from within an MSBuild file.

There is another open source tool called NuPack (maintained by Microsoft), released on CodePlex.

TODO:

Check out Scott Hanselman’s blog about NuPack

Is NoSQL the Future of Data Storage?

Speaker: Gary Short (twitter @garyshort)

Term introduced by a programmer from Last.fm.
NoSQL
- often does not implement ACID
- avoids joins
- no fixed schema
- scales horizontally (adding more machines)
Types of NoSQL DBs
- document stores
- graph storage: nodes and edges
- key/value stores: on disk/on RAM
- “eventually consistent”
- object DBs. You provide your own indexing rules.
Good for:
- geographic regions, large quantities of data, game server sharding (what’s sharding?)
- often written, rarely read > Key/Value
- binary data
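On the sharding question: sharding generally means splitting one dataset across several servers, each holding only a subset of the keys, which is one way to scale horizontally. A minimal sketch, assuming a simple hash-modulo scheme; the shard names and the `put`/`get` helpers are hypothetical, and the dictionaries stand in for separate machines.

```python
import hashlib

SHARDS = ["shard-0", "shard-1", "shard-2"]  # hypothetical server names

# One dict per shard stands in for one machine's local storage.
stores = {name: {} for name in SHARDS}


def shard_for(key, shards=SHARDS):
    # Use a stable hash so every process maps a key to the same shard
    # (Python's built-in hash() is randomized per process for strings).
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]


def put(key, value):
    # Route the write to the single shard that owns this key.
    stores[shard_for(key)][key] = value


def get(key):
    # Reads go to the same shard, so no other server is consulted.
    return stores[shard_for(key)].get(key)


put("player:42", {"score": 10})
put("player:7", {"score": 3})
print(get("player:42"))  # {'score': 10}
```

One known weakness of plain modulo routing is that changing the number of shards remaps most keys; consistent hashing is the usual way around that.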
Example: Twitter
- they tried RDB > didn’t scale.
- built FlockDB
  - not optimized for traversal, because not needed.
  - optimized for adjacency lists: graph stored as set of edges
  - idempotency: useful for computing set unions and intersections.
- Lessons learned: use aggressive timeouts
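The adjacency-list idea can be sketched in a few lines. This is an illustration only, not FlockDB's API or storage format: `FollowGraph` and its methods are hypothetical. Each user maps to a set of edges, adding the same edge twice has no further effect, and questions like "who do both follow?" become plain set operations.

```python
from collections import defaultdict


class FollowGraph:
    """Graph stored as sets of edges, indexed in both directions."""

    def __init__(self):
        self._following = defaultdict(set)  # source -> destinations
        self._followers = defaultdict(set)  # destination -> sources

    def add_edge(self, src, dst):
        # set.add is idempotent: replaying the same write changes nothing.
        self._following[src].add(dst)
        self._followers[dst].add(src)

    def following(self, user):
        # Return a copy so callers cannot mutate the stored edges.
        return set(self._following.get(user, ()))

    def followers(self, user):
        return set(self._followers.get(user, ()))


g = FollowGraph()
g.add_edge("alice", "carol")
g.add_edge("alice", "carol")  # repeated write, no effect
g.add_edge("bob", "carol")
g.add_edge("bob", "dave")
g.add_edge("alice", "dave")
g.add_edge("alice", "erin")

# Set arithmetic over adjacency lists.
common = g.following("alice") & g.following("bob")  # intersection
either = g.following("alice") | g.following("bob")  # union
print(sorted(common))  # ['carol', 'dave']
print(sorted(either))  # ['carol', 'dave', 'erin']
```

Nothing here walks the graph more than one hop, which matches the note that traversal was not a requirement.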

TODO:

check story of Twitter (DB angle)

Modern C#: this is not your grand-daddy’s language

Speaker: Jon Skeet and his pony. (twitter: @jonskeet)

Inspiring talk about the evolution of the language features of C#.

Jon illustrated how powerful C# has become by writing a MaxBy() implementation in various versions of C#. MaxBy() should return the element of a collection that has the maximum value for a specified field. It should work with any field and any collection.

He wrote the solution in C# 4 using generics, lambdas, extension methods and type inference.

In C# 1.0, he started with a simple for loop: this led to code specific to the class being compared. To make the code a bit more generic he used a delegate and some downcasting from System.Object to the type being compared: it worked but was somewhat verbose and not as type safe as the C# 4 version.
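For reference, the shape of MaxBy() can be sketched as follows. This is my own Python rendering of the idea, not Jon's C# code: the name `max_by`, the `Person` type and the sample data are assumptions for illustration. The key selector plays the role of the lambda, and the function accepts any iterable.

```python
from collections import namedtuple
from typing import Any, Callable, Iterable, TypeVar

T = TypeVar("T")


def max_by(items: Iterable[T], key: Callable[[T], Any]) -> T:
    """Return the element whose key(element) is largest."""
    it = iter(items)
    try:
        best = next(it)
    except StopIteration:
        raise ValueError("max_by() of an empty collection") from None
    best_key = key(best)
    for item in it:
        k = key(item)
        if k > best_key:  # strict '>' keeps the first of any tied elements
            best, best_key = item, k
    return best


Person = namedtuple("Person", ["name", "age"])  # hypothetical sample type
people = [Person("Alice", 31), Person("Bob", 27), Person("Carol", 45)]

print(max_by(people, key=lambda p: p.age).name)  # Carol
print(max_by(["a", "ccc", "bb"], key=len))       # ccc
```

Python's built-in `max(items, key=...)` already does this; the loop is spelled out only to show what the C# versions have to express.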

Jon’s quotes:

“Knowing how things work under the hood is important. However, you need to be able to shift gears so that you think at a higher level when you need to.”

“We need to be jolted out of the idea of what a language should look like.”

“You can express yourself more clearly without having to write things you don’t need. What needs to be expressed is expressed only once.”

“Learning F# helps you understand the new features of C# 4.”

Things you should know about SQL as a developer

Speaker: Simon Sabin (twitter: @simon_sabin)

Do not truncate the transaction log > you lose point-in-time recovery
Re-indexing
- You may just need to update the statistics.
Shrinking
- usually bad, files grow for a reason
- shrinking makes sense after a big import job only
Clustered indexes on dates
- make them small and unique
Types of joins
- Loop: for small datasets
- Hash: large datasets
- Merge: requires sets to be ordered
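The three join strategies can be sketched as textbook algorithms. These are simplified in-memory versions for illustration, not how SQL Server implements them; the function names and the sample rows are assumptions. All three compute the same inner join and differ only in how they find matching rows.

```python
from collections import defaultdict


def nested_loop_join(left, right, key_l, key_r):
    # Compare every left row with every right row: fine for small inputs.
    return [(l, r) for l in left for r in right if key_l(l) == key_r(r)]


def hash_join(left, right, key_l, key_r):
    # Build a hash table on one input, then probe it with the other.
    table = defaultdict(list)
    for r in right:
        table[key_r(r)].append(r)
    return [(l, r) for l in left for r in table.get(key_l(l), ())]


def merge_join(left, right, key_l, key_r):
    # Both inputs must already be sorted by the join key.
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        kl, kr = key_l(left[i]), key_r(right[j])
        if kl < kr:
            i += 1
        elif kl > kr:
            j += 1
        else:
            # Find the run of right rows sharing this key...
            j_end = j
            while j_end < len(right) and key_r(right[j_end]) == kl:
                j_end += 1
            # ...and pair it with every left row that has the same key.
            while i < len(left) and key_l(left[i]) == kl:
                out.extend((left[i], r) for r in right[j:j_end])
                i += 1
            j = j_end
    return out


# Hypothetical sample data, both lists sorted by customer id.
customers = [(1, "Ada"), (2, "Bob"), (3, "Cy")]
orders = [(1, "book"), (1, "pen"), (3, "ink"), (4, "cup")]


def by_id(row):
    return row[0]


print(hash_join(customers, orders, by_id, by_id))
```

The merge version makes a single pass over each input, which is why it needs both sides ordered on the join key.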
User-defined functions
- bad performance!
  - interpreted
  - can’t use parallelism
- demoed a padding operation done with a T-SQL UDF vs. one done with a CLR function. The CLR one was significantly faster.
- Prefer CLR functions over T-SQL UDFs.