Collation Oddity - can anyone shed any light?

Discussion:

(too old to reply)

Rob Meade

2012-08-16 09:37:50 UTC

Hi all,

I've used SQL Server for several years, but there is loads of it I will confess to not truly understanding, hence my post here, hoping that someone can help out.

We have two SQL Servers set up, using mirroring. Today we finally got around to testing the "Failover Partner" connection string attribute within a .net application and found that it errored because of a non-declared stored procedure parameter.

Turns out it was declared, but it was declared as @userID and then later used in the WHERE clause as @userId (note the lower case 'd').

So, case-sensitivity seems to be the issue (as well as a not very well checked stored procedure!)..

I've looked at the database that is mirrored and it is set to LATIN1_GENERAL_CS_AI. But here's the weird bit. The primary server is set to LATIN1.GENERAL_CI_AS, and the secondary(failover) server is set to LATIN1_GENERAL_CS_AI

So, our servers are not configured the same - I'm still ok with this so far, and changing the server collation on the secondary(failover) should resolve this (I hope), but the bit that's confusing me is that I thought that the 'database' collation setting would over ride the default of the server, but it seems that the server collation is over riding the database's?

Any one got any thoughts? I can give more details if needed...

Kind regards

Rob Meade

Jeroen Mostert

2012-08-16 20:31:00 UTC

Permalink

Post by Rob Meade
We have two SQL Servers set up, using mirroring. Today we finally got
around to testing the "Failover Partner" connection string attribute
within a .net application and found that it errored because of a
non-declared stored procedure parameter.

OK, stop here for a bit. Did you test if the client failed *only* on the
failover partner and not on the primary? Because, for reasons I'll detail
below, I feel it really ought to fail on both. If it doesn't, you've found
something interesting.

Post by Rob Meade
So, case-sensitivity seems to be the issue (as well as a not very well
checked stored procedure!)..

Well... it's subtle.

Even if the database is using a case-sensitive collation, stored procedure
parameter names are not case sensitive within the stored procedure, at least
not if the server collation is case-insensitive (at least in my tests; I had
no opportunity to try it with a case-sensitive server collation). So the
stored procedure is not at fault.

However, when *calling* the stored procedure, if the database is using a
case-sensitive collation, you need to use the correct case on the parameter
names. This applies both when executing the stored procedure directly and
when using sp_executesql. If I'm correct, you should find that executing the
stored procedure works fine if you spell @userID exactly as it is declared,
even if the WHERE spells it differently within the procedure.

Post by Rob Meade
I've looked at the database that is mirrored and it is set to
LATIN1_GENERAL_CS_AI.

That alone should explain why the stored procedure call fails regardless of
server collation. A case-sensitive collation on your database is supremely evil.

Alright, not *supremely* evil, but very awkward for most use cases.

Post by Rob Meade
But here's the weird bit. The primary server is set to
LATIN1.GENERAL_CI_AS, and the secondary(failover) server is set to
LATIN1_GENERAL_CS_AI
So, our servers are not configured the same - I'm still ok with this so
far, and changing the server collation on the secondary(failover) should
resolve this (I hope), but the bit that's confusing me is that I thought
that the 'database' collation setting would over ride the default of the
server, but it seems that the server collation is over riding the
database's?

"Changing" the server collation actually entails dropping all user databases
and rebuilding the system databases. If the second server is recently new,
you may as well redo the installation.

Collation is set when you first create a database. The collation is taken
from the server collation if you don't specify one. From that point on,
every character column you create inherits the database collation. Collation
doesn't cascade with defaults -- it's set on object creation. So no, the
server collation won't override anything. Or shouldn't, at least.

Now, back to your original question. Mirroring works by synchronously
committing transactions from a known identical restore. It ought to be
impossible for collation differences to arise under a mirroring situation,
at least within the database. What exactly happens with stored procedure
parameters is interesting.

--
J.

Erland Sommarskog

2012-08-16 20:48:03 UTC

Permalink

Post by Jeroen Mostert
Even if the database is using a case-sensitive collation, stored
procedure parameter names are not case sensitive within the stored
procedure, at least not if the server collation is case-insensitive (at
least in my tests; I had no opportunity to try it with a case-sensitive
server collation). So the stored procedure is not at fault.

Obviously the procedure is not a fault, as the error is in the application.

But variables always follow the server collation. Consider this script:

CREATE DATABASE CI COLLATE Latin1_General_CI_AS
go
USE CI
go
CREATE PROCEDURE sp AS
DECLARE @i int = 1
DECLARE @I int = 2
go
EXEC sp
go
CREATE PROCEDURE sp1 AS
DECLARE @i int = 1
SELECT @I = 2
go
EXEC sp2
gi
USE tempdb
go
DROP DATABASE CI

I get this output when I run it on my server which has Finnish_Swedish_CS_AS
as the server collation:

Msg 137, Level 15, State 1, Procedure sp1, Line 3
Must declare the scalar variable "@I".
Msg 2812, Level 16, State 62, Line 1
Could not find stored procedure 'sp2'.

That is, the first procedure compiles, the second fails to compile.

Post by Jeroen Mostert
A case-sensitive collation on your database is supremely evil.
Alright, not *supremely* evil, but very awkward for most use cases.

As this exercise shows, there is a use-case where case-sensitive
collations are essential: development environments. In my opinion,
you should always develop in a case-sensitive environment, exactly
to avoid this sort of problems.

In SQL 2012, there is a new feature that evades this problem: contained
databases. In a contained database, the collation for metadata is always the
one and the same, Latin1_General_100_CI_AS_KS_WS_SC.

--
Erland Sommarskog, SQL Server MVP, ***@sommarskog.se

Links for SQL Server Books Online:
SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx

Gene Wirchenko

2012-08-16 21:09:28 UTC

Permalink

On Thu, 16 Aug 2012 22:48:03 +0200, Erland Sommarskog

Post by Erland Sommarskog

Obviously the procedure is not a fault, as the error is in the application.
CREATE DATABASE CI COLLATE Latin1_General_CI_AS
go
USE CI
go
CREATE PROCEDURE sp AS
go
EXEC sp
go
CREATE PROCEDURE sp1 AS

Post by Erland Sommarskog
go
EXEC sp2

^
This should be "1"

Post by Erland Sommarskog
gi

^^
Not the best spelling of "go".

Post by Erland Sommarskog
USE tempdb
go
DROP DATABASE CI
I get this output when I run it on my server which has Finnish_Swedish_CS_AS
Msg 137, Level 15, State 1, Procedure sp1, Line 3
Msg 2812, Level 16, State 62, Line 1
Could not find stored procedure 'sp2'.
That is, the first procedure compiles, the second fails to compile.

My system has the default collation. At the lest, I am unaware
of having changed it. After correcting the errors noted above, I get:

Msg 134, Level 15, State 1, Procedure sp, Line 3
The variable name '@I' has already been declared. Variable names must
be unique within a query batch or stored procedure.
Msg 2812, Level 16, State 62, Line 1
Could not find stored procedure 'sp'.

Post by Erland Sommarskog

Post by Jeroen Mostert
A case-sensitive collation on your database is supremely evil.
Alright, not *supremely* evil, but very awkward for most use cases.

As this exercise shows, there is a use-case where case-sensitive
collations are essential: development environments. In my opinion,
you should always develop in a case-sensitive environment, exactly
to avoid this sort of problems.
In SQL 2012, there is a new feature that evades this problem: contained
databases. In a contained database, the collation for metadata is always the
one and the same, Latin1_General_100_CI_AS_KS_WS_SC.

Sincerely,

Gene Wirchenko

Erland Sommarskog

2012-08-17 20:39:59 UTC

Permalink

Post by Gene Wirchenko

Post by Erland Sommarskog
CREATE PROCEDURE sp1 AS

Post by Erland Sommarskog
go
EXEC sp2

^
This should be "1"

Oops! Since I expected get an error that the procedure is missing,
I did not notice that the names were different.

Post by Gene Wirchenko

Post by Erland Sommarskog
gi

^^
Not the best spelling of "go".

How did that happen? I know that I ran into that error when I tested
the repro, I corrected it. How did it end up in the news post? Did
I inadverently press CTRL-Z before I copied?

Post by Gene Wirchenko
My system has the default collation. At the lest, I am unaware
Msg 134, Level 15, State 1, Procedure sp, Line 3
be unique within a query batch or stored procedure.
Msg 2812, Level 16, State 62, Line 1
Could not find stored procedure 'sp'.

Yup. And you can test that you get the same errors if you change
the collation for the test database to be case-sensitive.

Gene Wirchenko

2012-08-17 21:14:51 UTC

Permalink

On Fri, 17 Aug 2012 22:39:59 +0200, Erland Sommarskog
[snip]

Post by Erland Sommarskog

Post by Gene Wirchenko

^^
Not the best spelling of "go".

How did that happen? I know that I ran into that error when I tested
the repro, I corrected it. How did it end up in the news post? Did
I inadverently press CTRL-Z before I copied?

Gremlins!

Post by Erland Sommarskog

Post by Gene Wirchenko
My system has the default collation. At the lest, I am unaware

^^^^
They got me, too.

[snip]

Sincerely,

Gene Wirchenko

Jeroen Mostert

2012-08-16 22:03:10 UTC

Permalink

Post by Erland Sommarskog

Obviously the procedure is not a fault, as the error is in the application.
CREATE DATABASE CI COLLATE Latin1_General_CI_AS
go
USE CI
go
CREATE PROCEDURE sp AS
go
EXEC sp
go
CREATE PROCEDURE sp1 AS
go
EXEC sp2
gi
USE tempdb
go
DROP DATABASE CI
I get this output when I run it on my server which has Finnish_Swedish_CS_AS
Msg 137, Level 15, State 1, Procedure sp1, Line 3
Msg 2812, Level 16, State 62, Line 1
Could not find stored procedure 'sp2'.
That is, the first procedure compiles, the second fails to compile.

This is a peculiar and to my knowledge undocumented thing (at least, it's
not documented in the places where I'd expect it to be documented). Seeing
as how it openly breaks database encapsulation, I consider it a serious
misfeature.

Of course, database encapsulation is already imperfect in light of tempdb
collation and suchlike (and contained databases are there precisely because
of stuff like this), but still. This just seems buggy.

Post by Erland Sommarskog

Post by Jeroen Mostert
A case-sensitive collation on your database is supremely evil.
Alright, not *supremely* evil, but very awkward for most use cases.

I'm afraid I don't follow. You should develop on a system with a
case-sensitive collation so you will run into the problems you will not have
on production because you're smart enough not to use a case-sensitive
collation there?

If you're developing a product that has to work on any server regardless of
collation, sure, this makes sense. But most developers aren't in the
business of developing general database software.

I'd probably introduce *more* errors if my server had case-sensitive
collation, as I'd have to remember to create all my databases with a
case-insensitive collation (otherwise the behavior would be seriously
different from production, with even data being case-sensitive).

Post by Erland Sommarskog
In SQL 2012, there is a new feature that evades this problem: contained
databases. In a contained database, the collation for metadata is always the
one and the same, Latin1_General_100_CI_AS_KS_WS_SC.

That's pretty nifty.

--
J.

Erland Sommarskog

2012-08-17 20:56:49 UTC

Permalink

Post by Jeroen Mostert
This is a peculiar and to my knowledge undocumented thing (at least, it's
not documented in the places where I'd expect it to be documented). Seeing
as how it openly breaks database encapsulation, I consider it a serious
misfeature.

It is the way it works. Probably out of legacy. And for the sake of
consistency. Consider:

USE CI
go
DECLARE @i int
USE CS
SELECT @I = 2

Which rules applies for the SELECT statement here?

Yes, in a stored procedure this cannot happen, but what if that SP uses
dynamic SQL?

Post by Jeroen Mostert
I'm afraid I don't follow. You should develop on a system with a
case-sensitive collation so you will run into the problems you will not
have on production because you're smart enough not to use a
case-sensitive collation there?

The post that sparked this is a good example of what could happen.
You never know what will happen on the other end.

Also consider this. Say that you have a procedure called someSP. A client
program has:

cmd.CommandText = "dbo.SomeSp"

This will lead to an extra lookup in the object catalog, because the cache
is case- and space-sensitive, so even if plan for someSP is the cache there
is no hit on the first try. Code executess, but there is an overhead.

If you development server and database is case-sensitive, you trap these
kind of errors.

Post by Jeroen Mostert
I'd probably introduce *more* errors if my server had case-sensitive
collation, as I'd have to remember to create all my databases with a
case-insensitive collation (otherwise the behavior would be seriously
different from production, with even data being case-sensitive).

Your development databases should also be case-sensitive. Yes, for testing
searches this can be a bit of a problem, but not much. Your test and QA
environments should of course follow production.