Author Topic: Weird Database designs  (Read 211 times)

Walter Mitty

  • Trusted Member
  • Wise Sage
  • ******
  • Posts: 1025
    • View Profile
Weird Database designs
« on: December 05, 2011, 09:15:27 am »
This is one of the very strangest database designs I ever saw. 

I don't know how many of you are familiar with CODASYL database systems, or with what is called the "plex" structure.  As I recall, a plex structure is merely a doubly linked list of records where all the records are of one type except for one record called the header record.  The plex structure represents a set in CODASYL, where the singular record is the set owner, and the other records are the set members.

In a relational database,  sets are expressed in a much simpler and more natural way.  All the members of a set contain common values in one or more columns, and it's up to the query to put them all together.  As far as set ownership goes,  a foreign key in the child rows that references a row in the parent table is enough to establish ownership.

Both CODASYL and relational can express multiple ownership in different contexts.  This is well nigh impossible in hierarchical database systems, without resorting to tricks.

Anyways, this was a "relational database" that used keys as if they were pointers in a codasyl plex structure.  You had to navigate your way through a plex structure, one query for every row retrieved, and you had to go through all hell and gone to get the owner. 

The person who built this was either perverse or ignorant.  Nobody at the site could make head not tail of it, and the whole application was held together with bubble gum and duct tape.   I only recognized the data structure because I had briefly surveyed CODASYL DBMS before deciding not to learn it. 

What a pain!  It ran slow.  It was very difficult to program.  The data was as rigid as it would have been in a CODASYL databse. Ugh!!!!

Richardk

  • Global Moderator
  • Wise Sage
  • *****
  • Posts: 3820
    • View Profile
Re: Weird Database designs
« Reply #1 on: December 05, 2011, 10:37:07 am »
I can't speak to the why or how it got there but this certainly reminds me of my database conversion days.

My favorite was always having multiple sets in one table or put another way, different record types in one file. The 'old' system could deal with it but there was no easy way to simply dump this data into a single table.

The issue was never 'can you fix it' but rather how minimize the pain and cost of a solution.

As for linked lists and such, does anyone use that anymore? With access to a database and good tools, it seems that linked lists are an academic skill that you learn and never use in practice.

Walter Mitty

  • Trusted Member
  • Wise Sage
  • ******
  • Posts: 1025
    • View Profile
Re: Weird Database designs
« Reply #2 on: December 05, 2011, 11:45:13 am »
I can't speak to the why or how it got there but this certainly reminds me of my database conversion days.

My favorite was always having multiple sets in one table or put another way, different record types in one file. The 'old' system could deal with it but there was no easy way to simply dump this data into a single table.

The issue was never 'can you fix it' but rather how minimize the pain and cost of a solution.

As for linked lists and such, does anyone use that anymore? With access to a database and good tools, it seems that linked lists are an academic skill that you learn and never use in practice.

Well, the DB that I'm talking about was one I saw in about 1997.  But it was archaic when it was built. 

I still use linked lists, and I still know how they work.  But if I can grab a tool like Java's SortedTable,  I'll let them do the linking for me.  I don't know at what stage knowledge about data structures becomes truly academic.  I suspect that it depends on the case at hand.




Richardk

  • Global Moderator
  • Wise Sage
  • *****
  • Posts: 3820
    • View Profile
Re: Weird Database designs
« Reply #3 on: December 05, 2011, 05:01:12 pm »
I still use linked lists, and I still know how they work.  But if I can grab a tool like Java's SortedTable,  I'll let them do the linking for me.  I don't know at what stage knowledge about data structures becomes truly academic.  I suspect that it depends on the case at hand.

If you have a tool then you might as well use it. Knowledge about data structures is always useful but my point was that you don't have to build them from the ground up. In fact with no pointers in some languages, you're better off using the database instead of kludging your own solution.

Walter Mitty

  • Trusted Member
  • Wise Sage
  • ******
  • Posts: 1025
    • View Profile
Re: Weird Database designs
« Reply #4 on: December 21, 2011, 08:21:38 am »
When I got introduced to relational databases,  the relational data model was contrasted with two prior data models.  Actually, the term "data model" was applied retroactively to these two database organization plans.  The term "data model" didn't come into vogue until the introduction of the relational data model.

The two previous data models were the hierachical data model and the network data model.  In the hierarchical data model, every record in the database has room for two pointers.  The first pointer serves to locate the eldest child record, and the second pointer serves to locate the next sibling record. 
Together, the two pointers enable the collection of records to be treated like a giant tree.  Good for working with tree structured data,  but limited when working with data linked in more complex ways.

The network data model expanded on the number of pointers stored in a record.  Each record type had a certain number of pointers that were a consequence of the record definiton.  With more than two pointers, it was possible to construct arbitrarily complex webs of records.  There is a subset of the network model that was sometimes called the CODASYL model.

Here a "pointer to another record" took the form of what was sometimes called a "record file address".  This generally consisted of three numbers:  a number to identify a database file,  a number to locate a block within that file,  and a number to locate the record within that block.  I'm glossing over some details. 

Walter Mitty

  • Trusted Member
  • Wise Sage
  • ******
  • Posts: 1025
    • View Profile
Re: Weird Database designs
« Reply #5 on: December 21, 2011, 08:33:52 am »
Anyways,  what I called a "web" of records in the last response can be called a "digraph" of records or more simply a "graph". 

The graph theory of data structuring is understood by anybody who has worked extensively with pointers.  For purposes of this discussion, the difference between a memory address and a record file address is trivial. 

It's significant to me that the architects of Java chose to hide pointers from the Java programmer.  C and C++ both made extensive use of pointers,  and one of the ways you earned your stripes with those two languages was to get to the point where you could use pointers to tie together a whole bunch of data into a single "schema".  Or so I'm told by C and C++ experts. 

I'm deliberately using the word "schema" in a quirky way in here.  I'm trying to relate data structuring techniques used by programmers to data structuring techniques used by data architects.

What was the downside of hierarchical and network databases?  It wasn't speed.  In the early days of relational databases, it was not too difficult to create a "graph database"  (hierarchical or network) that did the same thing, only faster.  It would still be possible today if the same engineering effort had been put into hierarchical or network systems that has been put into relational systems.  The reason is fundamental:  It's faster to follow a given pointer than it is to start with a foreign key and have the DBMS use an index to find a pointer for you.

« Last Edit: December 21, 2011, 10:21:07 am by Walter Mitty »

Walter Mitty

  • Trusted Member
  • Wise Sage
  • ******
  • Posts: 1025
    • View Profile
Re: Weird Database designs
« Reply #6 on: December 21, 2011, 10:38:22 am »
Back to the main theme of weird database designs.

One weird design I have seen repeatedly is a column design where data from more than one domain is stuffed into a single column.  What follows is not a real case, but it  illustrates the problem nicely.

We have a column named "code value".  The form of the Column is CHAR(2).  But different kinds of two character codes are stored in this column depending on what the writer is intending to convey. 

for instance,  if the value is "CA" and the code type is state codes,  it stands for "California".  But if the value is "CA" and the code type is airline codes, it stands for "Continental Airlines".  So now we've got homonyms in one lousy column.  If we want to use this column in WHERE conditions or JOIN conditions,  we have to disambiguate or else we'll get wrong resuults.  So there's a column next to it, named "code type".  Queries use the code type column to pick out the particular kind of code you are looking for. 

Guess what happens in practice?  Programmers and other query authors forget to put the disambiguation clause in their query and they get all sorts of baffling results. 

This is just plain bad design.  The domain "state codes" and the domain "airline codes" are two different domains.  The fact that they have a common format is irrelevant.  This saves only a trivial amount of space, and it adds more complexity than adding more columns.

When you try to tell the database designer that it's not good design,  sometimes you get the following pushback.  "Oh, but it's brilliant.  It allows programmers to define new code types without updating the metadata."   The fact that updating the metadata is "difficult" is precisely the idea behind coherent shared data.  There are unusual cases where this turns out to be brilliant.  But most of the time, it's just plain stupid.


TRexx

  • Trusted Member
  • Wise Sage
  • ******
  • Posts: 4547
    • View Profile
Re: Weird Database designs
« Reply #7 on: December 21, 2011, 11:22:06 am »
How about this one.  I have a table of widgets.  Each widget has a unique 8 digit identifier and a 1 character type. According to the business rules, a widget may not exist as more than one type.   The designer combined type & widget into a single column.  So now we have a row for A12345678 and another for B12345678.  Makes accessing and updating the table lots of fun. The designer said he did that for efficiency.   I'm convinced that he generates a daily report sorted by type and widget number and doesn't know how to do a 2 column sort.


 

Walter Mitty

  • Trusted Member
  • Wise Sage
  • ******
  • Posts: 1025
    • View Profile
Re: Weird Database designs
« Reply #8 on: December 21, 2011, 02:52:28 pm »
Good one, Trexx!

How about this one.

"We know that this column should be declared as an integer,  but our interface won't let us use any dataype except text.  So we declared it as CHAR, even though it always contains a number.
You say it doesn't sort correctly  because it's left justified?  I don't know what you're talking about.
You're the Oracle expert.  Fix Oracle so that it works right!"

 

choppedwood

  • Trusted Member
  • Guru
  • ******
  • Posts: 205
    • View Profile
Re: Weird Database designs
« Reply #9 on: December 28, 2011, 12:39:17 am »
This is a long one.

I used to work with a system that had a SKU number system somewhat akin to the following:

F2R123 on up to 14 characters and they could be alpha or numeric and sometimes there were 2 digits that meant something.

So, in my example, F might be Ford, 2 would be 2-door2, R radial tires, and so on.  I used to ask people what happens when you get more than 36 options? 

To be fair, the ERP package didn't allow us to define attributes in a sane and normal way.  We were pretty much limited to what was in the ERP system when we bought it so if you needed the length of a product, and there was no attribute, the only place it could go was the SKU or in the SKU description field, which is where it usually lived, and that field was limited to 60 characters.  So, now, lets go find all the purple Ford models.  It lived in the description, but, because there were only 60 characters, and god forbid, the person who input the sku ran out of space, because it might be spelled purp, purple, prpl, prp, pp, etc.

Oh, and I almost forgot, the description field was stored 3 different ways in 3 different tables.  In one table in was a straight 60 characters, in the second it was two 30 character fields and the piece de resistance, it was stored in a 60+ character field with a new-line character in the data after the 30th character.

Their customer numbers came with their own set of problems.  They looked like 11000100.

With the first 2-digits meaning warehouse 11.  So, when they created customers they had to have the first 2-digits mean something.  Of course, every single place in the system that you'd have looked for that also had a warehouse code, well, except one which I'll talk about in a second, so you didn't need the 2-digit starting code.  Now, here's where it got interesting.  We had people who created their codes outside of the guidelines, so instead of using 11, they were using 12, oops.  But, there was still a further complication.  A customer could buy from multiple locations so we dealt with that by entering a sub-warehouse code, so that the code above, if the customer purchased from warehouse 19, would be 11000100-1901.  And, yes, you could write an order from warehouse 11 using warehouse 19's code.

Ah, but there's more.  The credit system treated each of those as individual accounts.  So, it worked something like the following:

Cust #               Credit Limit     Balance
11000100-1001  10,000           9,000
11000100-1301  10,000           5,000
11000100-1501  10,000           9,000

Total Credit we wanted to give him 10,000.  Total credit available 30,000.  And, because each of these were separate warehouses, there was a different credit person working on each of these.  The system, actually, had a further complication.  There would be something akin to 11000100-0 which, in theory, denoted the parent account and would include the total of the AR, except, it didn't.  It was another record just like all the rest except that the vendor gave it "special" meaning in a couple of places.

Before I left a lot of this was cleaned up, and, because it was an old company most people knew where the bombs were but what a mess at times.  And the worst part, it wasn't all our fault.  The vendor actually encouraged some of this behavior as workarounds because their system didn't do what we needed, or wanted, it to do.

Oh, and this was further complicated by the fact that, depending on who was in charge during my time there, we had compensation plans that paid a single rep per customer, multiple reps per customer, and different reps depending on the product that was sold to said customer.  In my last 2 years there I think the commission plan changed 4 times and no, not one time did sales go up, although several reps went from making 80K to 140K, give or take. 

And, I haven't even talked about the grand plan they developed to track inventory which left a warehouse, got on a truck, and was never seen in the system again until it landed, and, yes, it could go to a different warehouse and get off and on another truck.  If, by chance, it was returned (and believe me accuracy ceased being our core skill by the end of all this), the material on paper would go to one warehouse but physically it went to another.  No, really.  These last 2 were designed by our new regime which was going to lead us to profitability.  Absolutely insane.

By the end of all this I was pretty bitter.  It got especially bad because the new management team was making decisions on things and excluding people from their meetings who might disagree with them.  I made it worse for myself because I developed a propensity for making statements like "are you guys serious", only to find out that it was the new president's idea.  And, the actions they were taking only escalated the problem on both our ends.  I should have left 2 years before I did because the writing was on the wall that far back. 

And with that, I'm done.

« Last Edit: December 28, 2011, 12:50:00 am by choppedwood »

choppedwood

  • Trusted Member
  • Guru
  • ******
  • Posts: 205
    • View Profile
Re: Weird Database designs
« Reply #10 on: December 28, 2011, 12:42:52 am »
How about this one.  I have a table of widgets.  Each widget has a unique 8 digit identifier and a 1 character type. According to the business rules, a widget may not exist as more than one type.   The designer combined type & widget into a single column.  So now we have a row for A12345678 and another for B12345678.  Makes accessing and updating the table lots of fun. The designer said he did that for efficiency.   I'm convinced that he generates a daily report sorted by type and widget number and doesn't know how to do a 2 column sort.

Just to add to my long post, because this is the one that prompted it.  When I asked people why the first 2-digits of the customer number had to be the warehouse, they said, and you said it, "because it sorted better in a couple of places in the system".

Walter Mitty

  • Trusted Member
  • Wise Sage
  • ******
  • Posts: 1025
    • View Profile
Re: Weird Database designs
« Reply #11 on: December 28, 2011, 07:28:04 am »
Sigh.  I have to plead guilty on using partial keys to impose a natural order on the items in a set. 


Whenver I've put together a transaction oriented item,  like transactions in a general ledger system,  I've used the item number to do two things:  first,  to identify distinct items in the same transaction.  And second, to specify the order in which they should appear within reports of transaction details. 

I'll plead not guilty, however, to smashing the partial key into the first two characters of a bigger key. 

Speaking of non meaningless keys,  how about VIN?  There is a whole lot of info coded into subfields of the VIN.  And some of those are mutable, ones that could change over the lifespan of the vehicle. 

choppedwood

  • Trusted Member
  • Guru
  • ******
  • Posts: 205
    • View Profile
Re: Weird Database designs
« Reply #12 on: December 28, 2011, 01:53:01 pm »
I can, apparently, write about this system for hours.  So, on ordering things, in this case, ordering it in the data.  Hope I didn't already use this one.

At one point we wanted to set up some sales goals in the ERP and I'd pull them from the system for our MySQL data warehouse.  The input fields looked something like:

Year:
Sales Code;
Period 1
Period 2
Period 3
.
.
.
.

The periods very clearly corresponded to months in the year on the entry screen.

So, naturally when I went to pull the data for a sales rep I wound up with.

Year,SLM Code,
Year,SLM Code,'Period 1 Sales'
Year,SLM Code,'Period 2 Sales'
Year,SLM Code,'Period 3 Sales'
.
.
.
.

Yup, that's right, there was no month in the data and the first field was blank.  It turned out that Progress has an array data type and they had used it to store the monthly sales goal according to it's position in the array, excepting for 0, which was empty.  I could actually query the data using an array sales[2] but that was useless for a bulk export.  I wound up hard-coding the report on the other end to work with this as the tool we used to export the data couldn't have produced a counter.


Share me

Digg  Facebook  SlashDot  Delicious  Technorati  Twitter  Google  Yahoo
Smf