Main
Description

Main program to invoke the LCM algorithm and compute closed frequent itemsets.

Usage:

hlcm input_data type_of_input_file support_threshold

The type of input file can be csv or num.

The support threshold is the minimum number of times a pattern must appear in the data to be reported (i.e., the minimum number of transactions in which it is included).
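
The notion of support described above can be sketched in a few lines of Haskell. This is only an illustration, not hlcm's internal code: the names `Transaction`, `support` and `isFrequent` are hypothetical, and transactions are assumed to be plain lists of items.

```haskell
-- | A transaction modelled as a plain list of items (an assumption of this sketch).
type Transaction a = [a]

-- | Number of transactions that contain every item of the given itemset.
support :: Eq a => [a] -> [Transaction a] -> Int
support itemset = length . filter (\t -> all (`elem` t) itemset)

-- | True when the itemset reaches the support threshold.
isFrequent :: Eq a => Int -> [a] -> [Transaction a] -> Bool
isFrequent threshold itemset db = support itemset db >= threshold
```

With a threshold of 2, an itemset is kept only if at least two transactions contain all of its items.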

Results are dumped on stdout.

For both input types, input data must be a file with one line per transaction.

* CSV data

For csv data, the items of a transaction are strings separated by commas.

Example: the file Data/simple.csv contains:

 bread,butter
 butter,chocolate,bread
 chocolate
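
A file in this format could be read into a list of transactions roughly as follows. This is a minimal sketch, not the parser hlcm actually uses: `splitOnComma` and `parseCsv` are hypothetical names, and items are assumed to contain no quoting or embedded commas.

```haskell
-- | Split one line into its items on commas.
-- Assumes items never contain quoted or escaped commas.
splitOnComma :: String -> [String]
splitOnComma s = case break (== ',') s of
  (item, [])       -> [item]
  (item, _ : rest) -> item : splitOnComma rest

-- | Turn the whole file contents into one list of items per line.
parseCsv :: String -> [[String]]
parseCsv = map splitOnComma . lines
```

Applied to the contents of a file such as Data/simple.csv, `parseCsv` yields one list of strings per transaction.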

To find the itemsets that appear at least 2 times, invoke hlcm with:

hlcm Data/simple.csv csv 2

Result:

 HLCM, (c) Alexandre Termier 2010, from original Takeaki Uno and Hiroki Arimura LCM algorithm.
 Input file was a CSV file.

 frequency: 2  cfis: ["chocolate"]
 frequency: 2  cfis: ["bread","butter"]
 
 There are 2 closed frequent itemsets.

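As expected, [chocolate] appears in two transactions, and no other item accompanies it in both, so it is reported on its own. The itemset [bread,butter] also appears in two transactions; [bread] and [butter] are not reported separately because each always occurs together with the other, so neither is closed.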
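
The result above can be cross-checked with a naive enumeration. The sketch below is deliberately not the LCM algorithm: it tries every non-empty subset of the items, so it is exponential and only suitable for tiny examples. The names `closure` and `closedFrequent` are hypothetical.

```haskell
import Data.List (nub, sort, subsequences)

-- | Closure of an itemset: the items shared by all transactions containing it.
-- If no transaction contains the itemset, it is returned unchanged.
closure :: Ord a => [[a]] -> [a] -> [a]
closure db itemset =
  case filter (\t -> all (`elem` t) itemset) db of
    []       -> itemset
    (t : ts) -> sort (nub (foldr (\u acc -> filter (`elem` u) acc) t ts))

-- | Brute force: keep every non-empty itemset that is frequent and equal to
-- its own closure, paired with its frequency.
closedFrequent :: Ord a => Int -> [[a]] -> [(Int, [a])]
closedFrequent threshold db =
  [ (sup, s)
  | s <- subsequences allItems
  , not (null s)
  , let sup = length (filter (\t -> all (`elem` t) s) db)
  , sup >= threshold
  , closure db s == s
  ]
  where
    allItems = sort (nub (concat db))
```

On the three transactions of the example with a threshold of 2, this enumeration keeps exactly ["bread","butter"] and ["chocolate"], each with frequency 2, which agrees with the hlcm output.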

* Numerical data

For num data, the items of a transaction are integers separated by spaces.

Example: the file Data/simple.num contains:

 1 2
 2 3 1
 3

As you can see, it is a simple transformation of simple.csv that replaces text items with integers.
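
One way such a transformation could be carried out is sketched below. The documentation does not say how simple.num was produced, so this is an assumption: each distinct item is numbered from 1 in order of first appearance. `encodeItems` and `renderNum` are hypothetical names.

```haskell
import Data.List (elemIndex, nub)
import Data.Maybe (mapMaybe)

-- | Replace every item by an integer code, numbering distinct items
-- from 1 in order of first appearance (an assumption of this sketch).
encodeItems :: Eq a => [[a]] -> [[Int]]
encodeItems db = map (mapMaybe code) db
  where
    vocabulary = nub (concat db)
    code item  = (+ 1) <$> elemIndex item vocabulary

-- | Render encoded transactions in the num format: one line per
-- transaction, items separated by spaces.
renderNum :: [[Int]] -> String
renderNum = unlines . map (unwords . map show)
```

With this numbering, the transactions of simple.csv become [1,2], [2,3,1] and [3], matching the contents of simple.num.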

To find the itemsets that appear at least 2 times, invoke hlcm with:

hlcm Data/simple.num num 2

Result:

 HLCM, (c) Alexandre Termier 2010, from original Takeaki Uno and Hiroki Arimura LCM algorithm.
 Input file was a NUMERIC file.
 
 frequency: 2  cfis: [3]
 frequency: 2  cfis: [1,2]
 
 There are 2 closed frequent itemsets.
Synopsis
main :: IO ()
Documentation
main :: IO ()
Main program, parses command line, calls LCM and dumps output nicely.
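
The command-line parsing step might look roughly like the sketch below. It is not taken from the hlcm source: `InputType`, `Options` and `parseArgs` are hypothetical names, and the error messages are invented for illustration. Only the three positional arguments documented in the usage line are assumed.

```haskell
import Text.Read (readMaybe)

-- | The two input formats named in the usage description.
data InputType = Csv | Num deriving (Eq, Show)

-- | The three positional arguments, once validated (hypothetical record).
data Options = Options
  { optFile      :: FilePath
  , optInputType :: InputType
  , optThreshold :: Int
  } deriving (Eq, Show)

-- | Validate the raw argument list; return an error message on failure.
parseArgs :: [String] -> Either String Options
parseArgs [file, ty, thr] = do
  inputType <- case ty of
    "csv" -> Right Csv
    "num" -> Right Num
    _     -> Left ("unknown input type: " ++ ty)
  threshold <- maybe (Left ("invalid support threshold: " ++ thr)) Right (readMaybe thr)
  Right (Options file inputType threshold)
parseArgs _ = Left "usage: hlcm input_data type_of_input_file support_threshold"
```

Keeping the parser pure makes it easy to test separately; a `main` in the IO monad would pass it the result of `System.Environment.getArgs`.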