Monday, 20 July 2015

Database + Application = Philosophy

Scenario: You want a database and an application. Sounds simple - and extremely common.

It should be simple, but too often I find that it is an over complicated part of the system - sometimes by the tools which claim to make it easier. i.e. Object Relational Mapper's (ORM's)

I have experience making a custom ORM from scratch as well as using off the shelf products: Entity framework, NHibernate and Subsonic.

Needless to say I am coming from the Microsoft stack - but these argument apply to whatever stack you are into.

They save you time by automatically implementing the data access layer. But I believe that it comes at a cost.

Why? Ultimately, it feels like there is an element of hype when it comes to using libraries. Inexperience will lead you to believe that they will solve all of your problems.
  • What is Microsoft/JQuery/etc doing now? Whatever it is, it must be awesome! It must be the next big thing!
  • What is the most downloaded library on Nuget? Everyone must be using it for a reason! I must have it too! I dont want to miss out!
  • A library that does loads of stuff for me? that MUST be good!
  • I won't need to touch the database if i use this library! WOW.
  • I can get my application up and running faster!
  • It can generate database upgrade scripts for me!
  • I get compile time checking for my queries! My application won't break anymore!
  • I can easily swap my database to another type!
  • If I plug in these libraries it will promote good design in my software! After all, we are taught that everything must be designed properly, with nice diagrams and glossy documentation.
With these arguments going on its easy to get caught up in the hype.
There is of course truth to all of the above points, but... 
  • Learning the ORM - takes time in itself - these libraries can be big! They are designed to work with different database types (SQL Server, Oracle, MySQL), multiple query options, configurable with XML files, edmx files or even more libraries like Fluent NHibernate. 
  • Code bloat - you need lots more config.
  • Unless you take care, you will end up being coupled to the library
  • There will be bugs and features missing - you may end up waiting on 3rd party release schedules.
  • You will have a large dependency on the library - which can make breaking changes with new versions.
  • The SQL generated and sent to the server can be unpredictable. You will need to intercept these and check what is running being run on the database.
  • Upgrade scripts generated by an ORM should be treated with extreme care. Only in the most trivial cases would I allow this to be executed and only then after a complete review.
  • ORMs mean you don't need to write stored procedures
    • less use of execution plans & slower performance.
    • Can't secure the database as easily. 
    • Cant develop or test database actions in isolation.
    • Make changes (within reason) without having to rebuild the application
  • You will avoid learning SQL. This would be a crime. SQL is so useful and isn't that hard either. It's more universal and stable than the ORMs. If your SQL is written by hand, you can make it efficient and easy to understand. If you do get problems with a query, it will be a lot easier to debug and fix manually written SQL than ORM generated SQL.
  • Your code will more likely become infected with Linq (or equivalent). Can obviously be prevented by imposing the right design - but the temptation maybe too great.
  • You will most likely still need stored procedures - for complex queries where linq etc just can't cut the mustard.
  • Are you really going to change your database mid project? How long have you used your database system - are you actually thinking of scrapping it? I doubt it. That's potentially years of investment in the nuances of a single database type. Rewriting data access isn't (relatively) such a big deal if you are swapping the database. 
  • How many tables do you have? is it really going to take that long to create a custom data access layer?
Essentially:

  • More is not always better.
  • If you are involved in data access, do yourself a favour: Learn ADO.NET and SQL.
  • Treat database development as a completely separate role to application development. They are different beasts and I believe it will be a few years before we can control both using a single methodology. 
  • ORM's still have a place. Don't discount them - just know when to use them and what their limitations are.

Whilst writing this blog post, I discovered blogs discussing the same subject:

  • http://rlacovara.blogspot.co.uk/
  • http://www.vertabelo.com/blog/technical-articles/orms-under-the-hood
More ORM related links:
  • http://russellscottwalker.blogspot.co.uk/2013/10/active-record-vs-data-mapper.html
  • https://dzone.com/articles/martin-fowler-orm-hate


Tuesday, 12 May 2015

Email query results from a SQL Server Agent Job step

I recently discovered sp_send_dbmail. Finally I have an easy way to send the results of overnight scheduled tasks to my inbox. However there were a few unexpected things that came up...

I am assuming that you have setup database mail? If so, then please continue.

So lets imagine that this is your TSQL job step. The query is trivial (but is actual quite useful when learning about who is running your SQL Agent jobs. Ill leave that to another blog post though.)

EXEC msdb.dbo.sp_send_dbmail
@profile_name = 'Default',
@recipients = 'your@email.com',
@subject = 'Sent from SQL Agent Job',
@execute_query_database = 'DB_TO_RUN_QUERY_ON',
@query = N'
SELECT  
ORIGINAL_LOGIN() AS [ORIGINAL LOGIN]
,SUSER_SNAME() AS [EFFECTIVE LOGIN]
,USER_NAME() AS [DB USER]',
@attach_query_result_as_file = 0;

Under the advanced page of your job step, you need to set Run As User to be a user that is a member of the DatabaseMailUser database role in the msdb database. For ease of use and to get things working I chose 'dbo' If you dont do this, then the TSQL will run under the SQL Agent service - and by default, this wont be a member of DatabaseMailUser database role in the msdb database.

The next thing was to set the database drop down list field to be msdb. 
I thought this was totally unexpected because I had already fully qualified the stored procedure with DB and schema. i.e. msdb.dbo.
And no, it makes no difference if you put a 'USE msdb;' at the top of your TSQL job step! 
MSDB MUST be set on the database drop down list. 

Tuesday, 21 April 2015

SSIS Custom Components and a Tale of Versions


I have been investigating SSIS custom components. These will allow non programmers in my team to use SSIS components which will perfectly fit our business data needs. e.g. custom data conversion. 
I am assuming that you know how to make an SSIS custom component?

  1. Create a class library: 
    1. Reference some dlls from the SQL Server Client Tools SDK.
    2. Inherit from PipelineComponent.
    3. Override some methods.
    4. See this for a starting point: https://msdn.microsoft.com/en-us/library/ms135993.aspx
  2. Take your newly built custom component:
    1. Copy it into a SQL Server installation folder.
    2. Registered into the GAC.
  3. Open your SSIS package IDE and register the component/refresh your toolbox.
  4. Your SSIS toolbox will then hopefully show this component.
  5. Bish bash bosh!
Hmmm....


I had a few issues getting this to work though because I had so many environment version 'variables' to deal with. For example:
  • Architecture - 32bit or 64bit? (of SQL Server/.Net/gacutil)
  • .Net framework - 2.0, 3.5, 4.0, 4.5 or 4.51?
  • Gacutil version?
  • SQL Server - 2008 or 2014?
  • SSIS Version - Business Intelligence Development Studio (BIDs) or Data tools for Visual Studio 2013?
What has made my situation more complicated is that I am running two versions of SQL Server: 2008 (for production) and 2014 (for development - getting ready for production to be upgraded). 
I am also running 64 bit version of SQL Server/Windows 7 and so my installation folders are across Program Files and Program Files (x86). On top of this, when I installed 2014 I didn't include the Client Tools SDK feature - so for a while I was missing the latest dlls - I wasn't even sure there were later dlls as my installation looked like it had everything...

On that note - you need to know about your SQL Server installation folders. Within the Microsoft SQL Server installation folder you will find lots of numbered folders. The folders relate to the versions of SQL Server installed. For example:
  • 80 = SQL Server 2000
  • 90 = SQL Server 2005
  • 100 = SQL Server 2008
  • 105 = SQL Server 2008 R2
  • 110 = SQL Server 2012
  • 120 = SQL Server 2014
These numbered folders can exist in both Program Files and Program Files (x86) installation folders. However, just because you have the folder, it doesn't mean you that have any or all of the features installed for that version. Folders are created regardless for backward compatibility reasons - e.g. access to older versions of DLL's & tools.

Ok, Now what?

Below are the dlls that you need to reference from your custom component dll. They come from the SQL Server Client Tools SDK, They are not listed under the .net framework assemblies (In Visual Studio 'Add Reference' dialog). You will need to manually browse to this directory and add them. 
  • Microsoft.SqlServer.DTSPipelineWrap.dll
  • Microsoft.SQLServer.DTSRuntimeWrap.dll
  • Microsoft.SQLServer.ManagedDTS.dll
  • Microsoft.SqlServer.PipelineHost.dll
With all this said, we have two routes for creating a custom component:

SQL Server 2005/2008 & BIDs 

  • Reference dlls from:
    • C:\Program Files (x86)\Microsoft SQL Server\100\SDK\Assemblies\
  • Copy your custom component dll to: 
    • C:\Program Files (x86)\Microsoft SQL Server\100\DTS\PipelineComponents\
  • Your custom component should target .Net 2.0, 3.0 or 3.5.
  • Use SQL Server Business Intelligence Development Studio to make your SSIS packages.
  • Ensure that the SQL Server 2005/2008 Client Tools SDK feature is fully installed.
  • Use gacutil from .Net 2.0, 3.0 or 3.5 or higher to register your component dll.
  • Add component to BIDs toolbox by right clicking the tool box and select 'Choose Items...'. Find your component under the 'SSIS Data Flow Items' or 'SSIS Control Flow Items' tab (depending on what type of component you have created). Check the component checkbox. Click OK.

SQL Server 2012/2014 & Data Tools

  • Reference dlls from:
    • C:\Program Files (x86)\Microsoft SQL Server\120\SDK\Assemblies\
  • Copy your custom component dll to: 
    • C:\Program Files (x86)\Microsoft SQL Server\120\DTS\PipelineComponents\
  • Your custom component should target .Net 4.0 or higher.
  • Use SQL Server Data Tools for Visual Studio to make your SSIS packages.
  • Ensure that the SQL Server 2012/2014 Client Tools SDK feature is fully installed.
  • Use gacutil from .Net 4.0 or higher to register your component dll.
  • Add component to Data Tools toolbox simply by right clicking the tool box and select 'Refresh Toolbox'.


Hope this helps you through a bad day!


Friday, 17 April 2015

Basic But Effective Backup Software


After a few years of using vice versa backup software, and attempting to use Acronis (urgh) I just discovered Windows Robocopy!

Its been built into Windows for a while (doh!) and works really well (hooray!)

I prefer to have two separate harddisks running on my home server. D for data and E for backup. Both are 2TB (they died) 4TB in size. I dont want mirroring because I dont want to automatically mirror any mistakes I might make. I don't like big bulky and bloated backup software. I don't mind one liner command prompts... I also like free stuff.

This is how I did it:

Put this in a batch file (on a single line):

robocopy D:\ E:\ /MIR /XD "D:\$RECYCLE.BIN" "D:\System Volume Information" /R:10 /W:10 /LOG+:"D:\RoboCopyLog.txt"

Where:

D:\ = source
E:\ = destination
/MIR = mirror everything
/XD = exclude directories
/R = number of retries if failure
/W = seconds to wait between retries
/LOG+ = append to log file

Run the batch file from windows task scheduler e.g. every night at 4am. It will need to be run under an administrator account.

I have about 1.5TB of data to backup - it only takes approx 30 seconds to run if there were no changes. Sweet!

Cheap and effective. Hope it can help you...