Enable Distinct Counts
Distinct count functionality determines the number of unique values in a column or expression within a selected table by comparing all the records pulled from the data store by a data source configuration. Users with editing access to data sources can enable distinct counts for specific fields. When distinct counts are enabled for a field, analyses of that field return the number of unique values rather than the total number of records. For example, distinct counts could return the number of:
- Unique customers in a sales database
- Unique UPC codes for a category of products
- Trucks in a company's fleet
For example, given a single collection and string field with the following three values:
- Apple
- Orange
- Apple
The distinct count returns 2, since there are only two distinct values (“Apple” and “Orange”), while an ordinary count returns 3 to reflect the total number of records. SQL-based connectors might produce a query that looks like this:
select count(distinct myField) from myCollection
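The difference between the two counts can be reproduced outside of Composer. The following is a minimal sketch, assuming an in-memory SQLite database as a stand-in for a real data store. The `myCollection` and `myField` names come from the query above. Everything else is illustrative and does not show how any Composer connector is implemented.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the real data store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myCollection (myField TEXT)")

# The three example values: Apple, Orange, Apple.
conn.executemany(
    "INSERT INTO myCollection (myField) VALUES (?)",
    [("Apple",), ("Orange",), ("Apple",)],
)

# Ordinary count: one per record.
total = conn.execute(
    "SELECT COUNT(myField) FROM myCollection"
).fetchone()[0]

# Distinct count: one per unique value.
distinct = conn.execute(
    "SELECT COUNT(DISTINCT myField) FROM myCollection"
).fetchone()[0]

print(f"count={total}, distinct count={distinct}")

conn.close()
```

With the three sample values, the ordinary count is 3 and the distinct count is 2.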
You enable distinct counts for a field in a data source configuration when you first define the data source or, later, when you edit it.
Support for this feature by Composer connectors is shown in the following table.
Key: Y - Supported; N - Not supported; N/A - Not applicable
Connector | Supported? | Notes
---|---|---
Amazon Redshift | Y |
Amazon S3 | Y |
Apache Drill | Y |
Apache Phoenix | Y |
Apache Phoenix Query Server (QS) | Y |
Apache Solr | Y |
BigQuery | Y |
Cloudera Impala | Y | Cloudera Impala connectors can receive only a single distinct count field in a query.
Cloudera Search | Y |
Couchbase | Y |
Dremio | Y |
Elasticsearch 6.0 | Y |
Elasticsearch 7.0 | Y |
Flat File | Y |
HDFS | Y |
Hive | Y |
MemSQL | Y |
Microsoft SQL Server | Y |
MongoDB | Y |
MySQL | Y |
Oracle | Y |
PostgreSQL | Y |
Presto | Y |
Real Time Sales | Y |
SAP Hana | Y |
SAP IQ | Y |
Snowflake | Y |
Spark SQL | Y |
Teradata | Y |
TIBCO DV | Y |
Upload API | Y |
Vertica | Y |
To enable distinct counts for a field:
1. Edit a data source configuration for a connector that supports distinct counts. See Edit a Data Source Configuration.
2. Select the Fields tab in the data source configuration.
3. In the field table, locate the field for which you want to enable distinct counts.
4. Select the checkbox in the Distinct Count column for that field. When an attribute field has distinct counts enabled, Composer treats it as a metric. See Metrics.
5. Save the data source configuration.