Question
Is it a good rule of thumb for database IDs to be "meaningless"? Conversely, are there significant benefits to structuring IDs so that they can be recognized at a glance? What are the pros and cons?
Background
I just had a debate with my coworkers about the consistency of the IDs in our database. We have a data-driven application that leverages Spring so that we rarely have to change code. That means, if there's a problem, a data change is usually the solution.
My argument was that by making IDs consistent and readable, we save ourselves significant time and headaches in the long term. Once the IDs are set, they don't have to change often, and if done right, future changes won't be difficult. My coworkers' position was that IDs should never matter. Encoding information into the ID violates DB design policies, and keeping them orderly requires extra work that "we don't have time for." I can't find anything online to support either position. So I'm turning to all the gurus here at SA!
Example
Imagine this simplified list of database records representing food in a grocery store. The first set has meaning encoded in the IDs, while the second does not:
IDs with meaning:
Type
1 Fruit
2 Veggie
Product
101 Apple
102 Banana
103 Orange
201 Lettuce
202 Onion
203 Carrot
Location
41 Aisle four top shelf
42 Aisle four bottom shelf
51 Aisle five top shelf
52 Aisle five bottom shelf
ProductLocation
10141 Apple on aisle four top shelf
10241 Banana on aisle four top shelf
//just by reading the IDs, it's easy to recognize that these are both Fruit on Aisle 4
IDs without meaning:
Type
1 Fruit
2 Veggie
Product
1 Apple
2 Banana
3 Orange
4 Lettuce
5 Onion
6 Carrot
Location
1 Aisle four top shelf
2 Aisle four bottom shelf
3 Aisle five top shelf
4 Aisle five bottom shelf
ProductLocation
1 Apple on aisle four top shelf
2 Banana on aisle four top shelf
//given the IDs, it's harder to see that these are both fruit on aisle 4
Summary
What are the pros and cons of keeping IDs readable and consistent? Which approach do you generally prefer and why? Is there an accepted industry best-practice?
-------- edit ( helpful background info from comments, below ): --------
In our tables, the Primary Key is always an ID field containing a unique integer. At first, that integer was arbitrary. Over time, some of these IDs naturally took on meaning among developers/testers. During a recent refactor, certain developers also took time to make all IDs easier to recognize. It made everyone's job 100X easier. Some people (who don't actually use the data/code) vehemently disagreed for theoretical reasons. In practice, not one of those objections is holding true. Moreover, all developers using the data agree that it's now significantly easier to maintain.
I'm looking for (but haven't seen) a defensible argument against using immediately recognizable IDs in a data-centric environment.
There are several problems with using database IDs to encode information about a row. If you want your carrots to have an "ID" of 203, you should add a product_id column (for example) and put this information there instead. Why?
The only required purpose of an ID is to uniquely identify a row within a table. If it can provide good lookup performance, that's a bonus, and if it can be compactly stored, that's another bonus. But it shouldn't contain any information about the entity in the row it identifies, other than the unique identifier of that entity.
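As a rough sketch of that separation, the following uses Python's built-in sqlite3 module with an in-memory database. The table layout, the product_key column name, and the renumbering of carrots to 303 are all assumptions made up for illustration; only the product_id column and the value 203 come from the discussion above. The idea is that other tables reference the arbitrary id, so the readable product_id can change without touching them.

```python
import sqlite3

# In-memory database, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (
        id         INTEGER PRIMARY KEY,      -- arbitrary surrogate key, no meaning
        product_id INTEGER NOT NULL UNIQUE,  -- readable code (e.g. 203), free to change
        name       TEXT NOT NULL
    );
    CREATE TABLE product_location (
        id          INTEGER PRIMARY KEY,
        product_key INTEGER NOT NULL REFERENCES product(id),  -- hypothetical column name
        location    TEXT NOT NULL
    );
""")

# Insert carrots with the readable code 203; SQLite assigns the surrogate id.
carrot_key = conn.execute(
    "INSERT INTO product (product_id, name) VALUES (?, ?)", (203, "Carrot")
).lastrowid
conn.execute(
    "INSERT INTO product_location (product_key, location) VALUES (?, ?)",
    (carrot_key, "Aisle five top shelf"),
)

# Suppose the numbering scheme changes and carrots become 303 (made-up value).
# Only the product row is updated; product_location is untouched.
conn.execute("UPDATE product SET product_id = ? WHERE id = ?", (303, carrot_key))

rows = conn.execute("""
    SELECT p.product_id, p.name, l.location
    FROM product_location AS l
    JOIN product AS p ON p.id = l.product_key
""").fetchall()
print(rows)
```

Because product_location points at the surrogate id, the renumbering is a single-row update, and the readable code is still available to anyone who joins against product.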