Many relational database engines support views. Views can act as a kind of virtual tables and using them can sometimes greatly improve the architecture and the testability of the system, but of course, there might be the price to pay.

Common simple calculations

You might encounter a situation when the table you would like to use in the rest of your system contains data that might kept is such a way that is not suitable for direct consumption for a specific purpose.

For example, in a double-entry accounting system a general ledger table might keep the booking amounts as positive decimal numbers across all rows and the sign of the amount is actually determined by some other column that keeps the credit or the debit side of the entry.

Just this fact that the booking value is encoded in a split way like this and the actual value for many usage purposes needs to be in the context of the booking side calls for a simple calculation expression to be made that will be applied for every row of the underlying table. A simple view that includes the newly calculated value based on the amount and the credit or debit side could be a perfect fit. Once written and later used in many places. As you can imagine, one price to pay here can be the performance. The database engine needs to apply the said calculation for each row retrieved through this view, but again, this is simple sign changing expression done on the data that is local to a row.

Concise version of the data

Tables can generally hold many columns that serve as attributes of the item that is described within a row. Specific usages of the data might not require all of those columns always. View can help here to select only the columns needed for a specific usage scenario and in the preferred column sequence.

When examining the data from a wide table it could be very hard to get a clear picture of what is in there. If the table has hundred columns and the only ones you absolutely need are columns in positions 1, 89, 16 and 74 and you want them in that order because they matter the most, plus few simple calculated columns and few other columns that are not so much important but still helpful for this scenario.

I prefer to write views like this - put important columns (or calculated expressions) first and nicely ordered, then few helpful but non-essential ones and nothing more.

There are also caveats. For example, you’re using SQL Server and create a view like this without the SchemaBinding attribute. Everything works fine until you change the definition of the underlying table, say insert new column in the middle of the column list for that table. Suddenly, your view starts to output some strange data, like ‘Mr.’ or ‘Mrs.’ in the fist_name column. The engine actually saves the metadata of the view when it’s created with the accessed table columns referred by the index of the column, not the name of the column. Some subsequent table modifications can later lead to this problem. You could protect yourself from this scenario by applying SchemaBinding attribute, but it also leads to a management problems when the underlying table(s) really need to change as it forces you to drop the entire set of schema bound views before the table modification and recreate them later (also in correct order, if the depend on one another). Other restrictions that come with using that attribute also apply. But then again, modifying tables by inserting columns in the middle of the column list is not a lightweight operation anyways and should occur rarely. I prefer to not use SchemaBinding attribute and use script to automatically refresh internal metadata definitions of all views by using system procedure sp_refreshview. Which also may come with caveats – some versions of the SQL server may revert the view definition to an older version if someone renamed it; also not applicable for the Azure cloud server versions.

Organizing code

SQL Server engines supports schemas within a database as a logical unit that can help organize the objects and apply security. When I’m working with an unfamiliar database and have permissions to create new stuff I usually first create ‘help’ schema and start creating helper views and other objects within it. The rationale behind this is simple – examining the model of the database takes time and as you go and learn pieces of the model it’s much easier to wrap up interesting queries into views that help you navigate the data faster and possibly expose some anomalies within the model or the data within.

The idea is also that these helper views are really just that – they are not used in other parts of the system and nothing depends on them. You can edit them as you learn new facts or drop them completely if you find them not useful. Within such a view you can easily put comments that contain examples of how to select interesting cases from your research. This can jog your memory much faster when you or your colleagues revisit such view much later and recall the reason of why and how something was done. There’s no need to search the files, the help is just there, within the context of a particular database you’re already connected to.

Another example of using views is to create a sort of “interface” between the database and the calling client API’s and applications. This is certainly not a requirement, but as a system grows in complexity, it can be beneficial to have layer of abstraction from the raw tables of the database. You could place views in and ‘api’ or ‘app’ schema and agree with your coworkers to use those schemas from the application side. One benefit of this approach is the easier testability of any particular view, not only of the data correctness but of the retrieval performance, too. As data volumes grow, you can test the performance and execution plans of such a view and possibly spot a need for adding or changing the indexes that would help to speed things up.

Of course, there could also be downsides of this approach, as you would have to manage yet another layer of abstraction and keep it in sync with other parts of the system.