Dataminer aggregation of Knowledge Foundation tables
Answer ID 5922   |   Last Review Date 10/24/2021

How is the data in the Knowledge Foundation tables aggregated? 

Environment:

Oracle B2C Service

Resolution:

The dataminer utility aggregates the data from the 'clickstreams' tables into the Knowledge Foundation tables. The data is aggregated on a per-interface basis, depending on the type of activity performed during a given time interval. Note that dataminer runs on end-user data that is 4 hours old, so the newest end-user statistics available are always 4 hours old.
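The 4-hour lag can be illustrated with a minimal Python sketch that computes the newest end-user activity time that could appear in the statistics at a given moment. The function name and the `LAG` constant are hypothetical and are not part of the product; only the 4-hour figure comes from this answer.

```python
from datetime import datetime, timedelta

# Lag stated in this answer; the constant name is an assumption.
LAG = timedelta(hours=4)

def newest_available(now: datetime) -> datetime:
    """Return the most recent end-user activity time dataminer would consider at `now`."""
    return now - LAG

print(newest_available(datetime(2016, 8, 12, 17, 30)))  # 2016-08-12 13:30:00
```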

Hourly (window = 0):
Initially, dataminer rolls the data into one-hour intervals. The roll-up for a given hour happens once that hour has ended: hour X is processed during the first dataminer run after the clock strikes X+1. For example, the data from 12:00:00 to 13:00:00 is rolled into the 12:00:00 hour the next time dataminer runs after the clock hits 13:00:00.
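The hourly rule can be sketched in Python as follows. This is only an illustration of the clock condition described above, not dataminer's actual implementation; both function names are hypothetical, and the sketch does not model the 4-hour data lag or the dataminer run schedule.

```python
from datetime import datetime, timedelta

def hourly_bucket(event_time: datetime) -> datetime:
    """Return the start of the one-hour interval that contains event_time."""
    return event_time.replace(minute=0, second=0, microsecond=0)

def earliest_hourly_rollup(event_time: datetime) -> datetime:
    """Return the clock time after which a dataminer run can roll up this hour."""
    return hourly_bucket(event_time) + timedelta(hours=1)

event = datetime(2016, 8, 12, 12, 37, 5)
print(hourly_bucket(event))           # 2016-08-12 12:00:00
print(earliest_hourly_rollup(event))  # 2016-08-12 13:00:00
```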

Daily (window = 1):
The daily roll-up occurs when dataminer runs at the end of the next day.  Therefore, the data for 08/12/16 will roll into a daily window during one of the dataminer runs at the end of the day on 08/13/16 or the beginning of the day on 08/14/16.   

Monthly (window = 2):
The monthly roll-up occurs when dataminer runs at the end of the next month. Therefore, the data for July 2016 will roll into a monthly window during one of the dataminer runs on the last day of August 2016 or the first day of September 2016.

Yearly (window = 3):
The yearly roll-up occurs when dataminer runs at the end of the next year.  Thus, the data for 2015 will not roll into a yearly window until dataminer runs on 12/31/16 or 01/01/17.
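The daily, monthly, and yearly rules share one pattern: a period is rolled up around the end of the period that follows it. The Python sketch below computes that date under this reading. The function name is hypothetical, the window codes are the ones listed above, and the real roll-up may land on a run at the start of the following day, as the examples note.

```python
from datetime import date, timedelta

def next_period_end(period_start: date, window: int) -> date:
    """Return the last day of the period that follows the one starting at period_start.

    window: 1 = daily, 2 = monthly, 3 = yearly (codes as used in this answer).
    """
    if window == 1:
        # The next day is itself the "next period".
        return period_start + timedelta(days=1)
    if window == 2:
        # First day of the month two months ahead, minus one day.
        month_index = period_start.year * 12 + (period_start.month - 1) + 2
        year, month0 = divmod(month_index, 12)
        return date(year, month0 + 1, 1) - timedelta(days=1)
    if window == 3:
        return date(period_start.year + 1, 12, 31)
    raise ValueError("window must be 1 (daily), 2 (monthly), or 3 (yearly)")

print(next_period_end(date(2016, 8, 12), 1))  # 2016-08-13
print(next_period_end(date(2016, 7, 1), 2))   # 2016-08-31
print(next_period_end(date(2015, 1, 1), 3))   # 2016-12-31
```

These three calls reproduce the dates used in the daily, monthly, and yearly examples above.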

 

Roll-ups are designed to aggregate data from a specific time window and store it in the database under a new time window. This mechanism conserves space and keeps the KF tables as efficient and lightweight as possible. When performing data analysis, compare data within the same interval window. When using standard reports such as "Answer Effectiveness" (ac_id 206), the window you choose depends on the data being analyzed. Very recent data (within 30 days) should be compared using the daily filter. Analysis of data between 30 days and a year old should use the monthly filter. Any long-term historical analysis (more than a year) should use the yearly window.
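The guidance on choosing a window can be expressed as a small Python helper. This is a sketch only: the function name is hypothetical, and the exact boundaries (treating exactly 30 days as daily and a year as 365 days) are assumptions, since this answer does not define them precisely.

```python
def recommended_window(age_days: int) -> int:
    """Return the window code (1 = daily, 2 = monthly, 3 = yearly) for data of a given age."""
    if age_days < 0:
        raise ValueError("age_days must not be negative")
    if age_days <= 30:    # very recent data: daily filter (boundary is an assumption)
        return 1
    if age_days <= 365:   # 30 days to a year: monthly filter (365 is an assumption)
        return 2
    return 3              # more than a year: yearly window

print(recommended_window(7))    # 1
print(recommended_window(90))   # 2
print(recommended_window(800))  # 3
```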