In Apache Hive, COLLECT_SET is an aggregate function that collects the unique values from multiple rows into an array. In Trino, you can use the ARRAY_AGG function with the DISTINCT keyword instead. Neither engine guarantees the order of the elements in the resulting array.
Apache Hive:
-- To avoid error: IllegalArgumentException Size requested for unknown type: java.util.Collection
SET hive.map.aggr = false;

-- Duplicate value 'red' will be removed
SELECT COLLECT_SET(color)
FROM (
  SELECT 'red' AS color
  UNION ALL SELECT 'white' AS color
  UNION ALL SELECT 'black' AS color
  UNION ALL SELECT 'red' AS color
) t;
-- Result: ["red","white","black"]
Trino:
-- Duplicate value 'red' will be removed
SELECT ARRAY_AGG(DISTINCT color)
FROM (
  SELECT 'red' AS color
  UNION ALL SELECT 'white' AS color
  UNION ALL SELECT 'black' AS color
  UNION ALL SELECT 'red' AS color
) t;
-- Result: [black, white, red]
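Two differences may matter when migrating real queries. Hive's COLLECT_SET skips NULL values, while Trino's ARRAY_AGG keeps them, and the element order can differ between the two engines. The following Trino query is a minimal sketch of one way to handle both, assuming you want NULLs dropped and a sorted array. The FILTER clause and the array_sort call are additions for illustration, not part of the original mapping, and the NULL row is sample data added to show the effect:

```sql
-- Sketch only: drop NULLs to mimic COLLECT_SET, and sort for a stable result
SELECT array_sort(
         ARRAY_AGG(DISTINCT color) FILTER (WHERE color IS NOT NULL)
       ) AS colors
FROM (
  SELECT 'red' AS color
  UNION ALL SELECT 'white' AS color
  UNION ALL SELECT 'black' AS color
  UNION ALL SELECT CAST(NULL AS varchar) AS color  -- hypothetical NULL row
  UNION ALL SELECT 'red' AS color
) t;
-- Result: [black, red, white]
```

If the downstream code does not depend on element order, the array_sort call can be omitted. The FILTER clause is only needed when the column can contain NULL.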
For more information, see Apache Hive to Trino Migration.