Discussion:
Best way to store vectors?
david
2010-07-08 21:42:01 UTC
Permalink
Hi,

Can anyone point me to a reference or discuss the best way to store a
vector of 120 to 480 numbers in the database? Rows seem to be out
since we would quickly top the billion row mark. A table with 480
columns is too unnormalized. A single varchar(max) column? This
seems the best answer for now unless there is a more efficient way of
storing it.

Thanks for any help or opinions,

David
--CELKO--
2010-07-08 22:50:40 UTC
Permalink
I think of a vector as a particular kind of mathematical structure and
you seem to be talking about a list of some kind. Vectors have a fixed
number of dimensions, etc. Here is a guess:

CREATE TABLE Vectors
(vector_id CHAR(3) NOT NULL, --whatever
 dim_nbr INTEGER NOT NULL
   CHECK (dim_nbr BETWEEN 1 AND 480),
 dim_val INTEGER NOT NULL,
 PRIMARY KEY (vector_id, dim_nbr));

Do you need to enforce the lower limit of 120 or not?
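For what it is worth, a minimal sketch of how that row-per-dimension
design gets used; the vector_id 'abc' and the values are invented
for illustration:

INSERT INTO Vectors (vector_id, dim_nbr, dim_val)
VALUES ('abc', 1, 42);
INSERT INTO Vectors (vector_id, dim_nbr, dim_val)
VALUES ('abc', 2, 17);

-- individual points are directly addressable:
SELECT dim_val
  FROM Vectors
 WHERE vector_id = 'abc'
   AND dim_nbr = 2;

-- as are aggregates over a whole vector:
SELECT AVG(dim_val)
  FROM Vectors
 WHERE vector_id = 'abc';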
david
2010-07-09 13:04:36 UTC
Permalink
Post by --CELKO--
I think of a vector as a particular kind of mathematical structure and
you seem to be talking about a list of some kind. Vectors have a fixed
CREATE TABLE Vectors
(vector_id CHAR(3) NOT NULL, --whatever
 dim_nbr INTEGER NOT NULL,
   CHECK (dim_nbr BETWEEN 1 AND 480),
 PRIMARY KEY (vector_id, dim_nbr),
 dim_val INTEGER NOT NULL);
Do you need to enforce the lower limit of 120 or not?
Joe,

Thank you for your response.

The vectors I was referring to had to do with loan lengths in months:
terms run from 10 to 40 years, hence 120 to 480 numbers to be used as
adjustments.

I spoke with the senior developer today and he had already solved the
problem and tested the speed of the solution (as rows). He was going
to submit it today for review and upgrade to QA. My fault for not
checking the need after the junior guy asked for help.

Thanks again.

David
Erland Sommarskog
2010-07-09 09:36:23 UTC
Permalink
Post by david
Can anyone point me to a reference or discuss the best way to store a
vector of 120 to 480 numbers in the database? Rows seem to be out
since we would quickly top the billion row mark. A table with 480
columns is too unnormalized. A single varchar(max) column? This
seems the best answer for now unless there is a more efficient way of
storing it.
It depends on what you want to do with the data.

There is no issue per se if you get a billion rows in a table. The table
wouldn't be that much bigger than if you instead have a 480-column table,
and the total size is the biggest concern. (The table would be bigger
with many narrow rows, though, since there is a certain overhead per row.)

The key is: do you have any need to access the individual points in the
vector from SQL? If you do, you should store them as rows, or else it
will be very painful.

But if all manipulation of the details of the vector will happen outside
SQL Server, there can indeed be reason for a more compact format. In
that case, I would use varbinary(MAX). An alternative that would permit
some access to individual points in the vector is to implement a
user-defined data type in the CLR. Assuming the points are floats,
this would only be an option if you are on SQL 2008, since UDTs are
limited to 8000 bytes on SQL 2005.
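
To make the trade-off concrete, a hedged sketch of the varbinary(MAX)
route (the table and column names are invented). A single point can
still be pulled out with SUBSTRING, relying on SQL Server's cast of
binary(8) to float, but it is clumsy next to the row-per-point design:

CREATE TABLE PackedVectors
(vector_id CHAR(3) NOT NULL PRIMARY KEY,
 vector_data VARBINARY(MAX) NOT NULL); -- points packed as 8-byte floats

-- point @n (1-based) of vector 'abc':
DECLARE @n INT = 37;
SELECT CAST(SUBSTRING(vector_data, (@n - 1) * 8 + 1, 8) AS FLOAT)
  FROM PackedVectors
 WHERE vector_id = 'abc';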
--
Erland Sommarskog, SQL Server MVP, ***@sommarskog.se

Links for SQL Server Books Online:
SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
SQL 2000: http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
david
2010-07-09 13:05:29 UTC
Permalink
Erland,

Thank you for your response.

The vectors I was referring to had to do with loan lengths in months:
terms run from 10 to 40 years, hence 120 to 480 numbers to be used as
adjustments.

I spoke with the senior developer today and he had already solved the
problem and tested the speed of the solution (as rows). He was going
to submit it today for review and upgrade to QA. My fault for not
checking with him after the junior guy asked for help.

Thanks again.

David
