Dealing with Duplicate Values
Causes of Duplicate Values
Be judicious when adding data objects. If the report has unexpected duplicate values or empty rows, the cause is most likely that there is a one-to-many join to an object not being used.
Resolving Duplicate Values
Consider a report that lists the students that are advised by professors at a college. This requires the Professors and Students data objects, which have a one-to-many relationship as one professor may advise several students. Adding fields from both of these data objects to a report will result in duplicate data.
There are several ways to handle situations like this:
- Utilize Group Sections
- Apply Suppress Duplicates/Hide Repeated Values to the Data Objects
- Apply Suppress Duplicates/Hide Repeated Values to the cell
- Address the Data Model
1. Utilize Group Sections
Group Header & Group Footer sections can be used to display data from the “one” side of a “one-to-many” relationship. For reports that show several one-to-many relationships, use Repeating Groups.
This feature checks unique keys in the data objects so that a professor only appears once on the report. Since each professor is identified with a unique key in the data source, even if there is a professor with the same name, they will appear since they are not the same professor.
See Advanced Reports: Sections (v2021.1+).
2. Apply Suppress Duplicates/Hide Repeated Values to a Category/Data Object
The Suppress Duplicates/Hide Repeated Values checkbox in the Manage Data Objects dialog will hide any data from a duplicated record from appearing on the report. This is a visibility change only—there is no change to aggregation, SQL generation or the number of rows returned from the data source.
This feature does not check unique keys in the data objects, it looks at only the values displayed in the cells. Caution should be exercised when using this method if there could be unique duplicate data. For example, if there is another professor named Joyce Anderson, that professor’s name would not appear on the report output.
3. Apply Suppress Duplicates/Hide Repeated Values to a Cell
The Suppress Duplicates/Hide Repeated Values icon on the toolbar will suppress only the value in the cell it is applied to if the previous value is the same. This is a visibility change only—there is no change to aggregation, SQL generation or the number of rows returned from the data source.
This feature does not check unique keys in the data objects, it looks at only the value displayed in the cell. Caution should be exercised when using this method if there could be unique duplicate data. For example, if there is another professor with last name Anderson, that professor’s name would not appear on the report list.
Since the cells where the duplicates have been removed are empty, this can be tested for with a conditional formatting formula for other report manipulation.
4. Address the Data Model
If still seeing duplicates after trying the above options, it is likely a result of a Cartesian product. Consider:
- modifying the joins on the report or the data objects on it to return only the necessary data
- ensure that any one-to-many joins are configured as such in the configuration and ensuring that Special Cartesian Processing is True in the Joins dialog. See Advanced Reports: Joins.