In this paper, YouTube videos are used to define the concepts of the data warehouse (DW), data mart (DM), and ETLs. To choose the videos, I paid attention to the authors, the number of subscribers and views, the informative and visual quality of the video, and its connection to health care.
Data Warehouse and Data Mart
WCI Consulting (2016) points out that the terms DW and DM both denote sets of data that are “used for reporting and analysis,” but a DM is smaller than a DW, and a DW includes several (typically many) DMs. Thus, a DM is a subset of a DW, and its purpose is to provide usable information by “harvesting” it from the databases that are available to it.
From the information stated above, it may be assumed that the SEER website offers access to a large warehouse of various data. For instance, the GeoViewer Application (NIH, n.d.b) includes a menu of commands that offers access to its subsets (DMs). One can choose, for example, to view the data on cervical cancer incidence in Hispanic women in Iowa in 2008-2012. Therefore, the attributes of the mortality and incidence rates in Figure 1 can be regarded as DMs for the DW of SEER, even though this model is rather crude.
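The DW and DM relationship described above can be illustrated with a minimal sketch. It assumes a single warehouse table and models a data mart as a view over it; the table name, column names, and all rate values are hypothetical placeholders chosen for illustration and are not SEER data or the actual SEER schema.

```python
import sqlite3

# Hypothetical warehouse table; names and values are placeholders, not SEER data.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE warehouse_rates (
           state TEXT, site TEXT, ethnicity TEXT, sex TEXT,
           period TEXT, measure TEXT, rate REAL)"""
)
rows = [
    ("Iowa", "cervix", "Hispanic", "F", "2008-2012", "incidence", 1.0),
    ("Iowa", "cervix", "Hispanic", "F", "2008-2012", "mortality", 2.0),
    ("Iowa", "lung", "Hispanic", "F", "2008-2012", "incidence", 3.0),
    ("Utah", "cervix", "Hispanic", "F", "2008-2012", "incidence", 4.0),
]
conn.executemany("INSERT INTO warehouse_rates VALUES (?, ?, ?, ?, ?, ?, ?)", rows)

# A data mart modeled as a view: a subject-specific subset of the warehouse.
conn.execute(
    """CREATE VIEW mart_incidence AS
       SELECT state, site, ethnicity, sex, period, rate
       FROM warehouse_rates
       WHERE measure = 'incidence'"""
)


def query_mart(state, site):
    """Return the incidence rates stored in the mart for one state and cancer site."""
    cur = conn.execute(
        "SELECT rate FROM mart_incidence WHERE state = ? AND site = ?",
        (state, site),
    )
    return [r[0] for r in cur.fetchall()]


iowa_cervix = query_mart("Iowa", "cervix")
mart_size = conn.execute("SELECT COUNT(*) FROM mart_incidence").fetchone()[0]
```

In this sketch the mart holds fewer rows than the warehouse and answers one kind of question (incidence), which mirrors the idea of a DM as a smaller, purpose-specific subset of a DW.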
An ETL tool, as defined by Intricity101 (2011), is a tool that is used to Extract, Transform, and Load data. In other words, it moves data between systems and transforms it along the way, which may be employed for data warehousing. ETLs are critical for data management, and choosing one for a database of the SEER scope is a strategic decision that needs to be made with caution.
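The three ETL steps can be sketched as follows. This is only an illustration under stated assumptions: the CSV text, the column names, the cleaning rules, and the function names `extract`, `transform`, `load`, and `run_etl` are all hypothetical, and the in-memory SQLite database merely stands in for a warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical raw export; the values are placeholders, not real case counts.
RAW_CSV = """state,site,cases
 iowa ,Cervix,10
UTAH,cervix,
iowa,LUNG,7
"""


def extract(text):
    """Extract: read the raw records from the source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(records):
    """Transform: clean the records in application code, before loading."""
    cleaned = []
    for rec in records:
        if not rec["cases"].strip():
            continue  # drop rows with a missing count
        cleaned.append(
            (rec["state"].strip().title(), rec["site"].strip().lower(), int(rec["cases"]))
        )
    return cleaned


def load(conn, rows):
    """Load: write the already transformed rows into the target table."""
    conn.execute("CREATE TABLE IF NOT EXISTS cases (state TEXT, site TEXT, cases INTEGER)")
    conn.executemany("INSERT INTO cases VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(text):
    conn = sqlite3.connect(":memory:")
    load(conn, transform(extract(text)))
    return conn


etl_conn = run_etl(RAW_CSV)
etl_rows = etl_conn.execute("SELECT state, site, cases FROM cases ORDER BY site").fetchall()
```

Note that the transformation runs in the Python process, outside the target database; only clean rows ever reach the `cases` table.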
As stated by Oracle Midsize (2014), it is becoming increasingly difficult for ETLs to deal with the growing data flow. Since the ETL “transform” part is situated between “extract” and “load,” it requires a middle server that is supposed to be very “robust” and expensive. Oracle Midsize (2014) develops its data transfer tools using another technique, ELT, which shifts the transformation to the “already robust” target database.
In this case, the transformation happens after the loading, which, according to Oracle Midsize (2014), improves the performance of the ELT tool. In healthcare, costs remain a significant issue, and NIH (n.d.a) states that it is dedicated to the continuous improvement of SEER quality (para. 4). Efficient spending and improved performance of data transfer and transformation tools are in line with this intent. Oracle Midsize (2014) specifically highlights the analytical advantages of ELT, but an operational database will also benefit from improved data transfer and transformation. Also, given the reputation of the Oracle Corporation (2016), Oracle might be chosen as a provider of the tools for the database.
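The ELT ordering described above, in which raw data is loaded first and transformed inside the target database, can be sketched as follows. This is a minimal illustration and not Oracle's tooling: the staging table, the sample rows, and the function name `run_elt` are hypothetical, and an in-memory SQLite database stands in for the target database.

```python
import sqlite3

# Hypothetical raw rows; the values are placeholders, not real case counts.
RAW_ROWS = [(" iowa ", "Cervix", "10"), ("UTAH", "cervix", ""), ("iowa", "LUNG", "7")]


def run_elt(raw_rows):
    conn = sqlite3.connect(":memory:")

    # Load: the raw values go into a staging table untouched.
    conn.execute("CREATE TABLE staging_cases (state TEXT, site TEXT, cases TEXT)")
    conn.executemany("INSERT INTO staging_cases VALUES (?, ?, ?)", raw_rows)

    # Transform: the cleaning runs as SQL inside the target database itself,
    # so no separate middle server is involved.
    conn.execute(
        """CREATE TABLE cases AS
           SELECT UPPER(TRIM(state)) AS state,
                  LOWER(TRIM(site)) AS site,
                  CAST(cases AS INTEGER) AS cases
           FROM staging_cases
           WHERE TRIM(cases) <> ''"""
    )
    conn.commit()
    return conn


elt_conn = run_elt(RAW_ROWS)
elt_rows = elt_conn.execute("SELECT state, site, cases FROM cases ORDER BY site").fetchall()
staged = elt_conn.execute("SELECT COUNT(*) FROM staging_cases").fetchone()[0]
```

The design difference from ETL is only where the work happens: all three raw rows reach the database, and the database engine performs the cleaning, which is the shift of the transformation step to the target that Oracle Midsize (2014) describes.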
References
Intricity101. (2011). What is an ETL Tool? Web.
NIH. (n.d.a). About SEER. Web.
NIH. (n.d.b). GeoViewer Application. Web.
Oracle Corporation. (2016). Oracle Fact Sheet. Web.
Oracle Midsize. (2014). ELT is not ETL – Oracle Data Warehouse Solutions. Web.
WCI Consulting. (2015). What is a Data Warehouse, Data Mart & a Reporting Database (DW/DM/RDB)? Web.